A technical report on hitting times, mixing and cutoff


Jonathan Hermon


Abstract 

Consider a sequence of continuous-time irreducible reversible Markov chains and a sequence of initial distributions, $\mu_n$. Instead of performing a worst case analysis, one can study the rate of convergence to the stationary distribution starting from these initial distributions. The sequence is said to exhibit (total variation) $\mu_n$-cutoff if the convergence to stationarity in total variation distance is abrupt, w.r.t. (with respect to) this sequence of initial distributions.

In this work we give a characterization of $\mu_n$-cutoff (and more generally, total-variation mixing) for an arbitrary sequence of initial distributions $\mu_n$ (in the above setup). Our characterization is expressed in terms of hitting times of sets which are "worst" (in some sense) w.r.t. $\mu_n$.

Consider a Markov chain on $\Omega$ whose stationary distribution is $\pi$. Let $t_{\mathrm{H}}(\alpha) := \max_{x \in \Omega,\, A \subset \Omega :\, \pi(A) \ge \alpha} \mathbb{E}_x[T_A]$ be the expected hitting time of the set of stationary probability at least $\alpha$ which is "worst in expectation" (starting from the worst starting state). The connection between $t_{\mathrm{H}}(\cdot)$ and the mixing time of the chain was previously studied by Aldous and later by Lovász and Winkler, and was recently refined by Peres and Sousi and independently by Oliveira. In this work we further refine this connection and show that $\mu_n$-cutoff can be characterized in terms of concentration of hitting times (starting from $\mu_n$) of sets which are worst in expectation w.r.t. $\mu_n$. Conversely, we construct a surprising counter-example which demonstrates that in general cutoff (as opposed to cutoff w.r.t. a certain sequence of initial distributions) cannot be characterized in this manner.

In addition, we prove a decomposition theorem which asserts that for reversible Markov chains on a finite state space, mixing with respect to some relaxations of $L^\infty$-mixing and separation mixing is in fact equivalent to total variation-mixing.

Finally, we also prove that there exists an absolute constant $C$ such that for any reversible chain $\varepsilon\,(t_{\mathrm{H}}(\varepsilon) - t_{\mathrm{H}}(1-\varepsilon)) \le C t_{\mathrm{rel}} |\log \varepsilon|$, for all $0 < \varepsilon < 1/2$, where $t_{\mathrm{rel}}$ is the inverse of the spectral gap of the chain.

Keywords: Mixing-time, hitting times, cutoff, finite reversible Markov chains, maximal inequality, counter-example.
1 Introduction


This work is a continuation of [3], in which Starr's maximal inequality was used to characterize the cutoff phenomenon for reversible Markov chains in terms of concentration of hitting times. Here, using the same technique, we present several new related results.

Generically, we shall denote the state space of a Markov chain by $\Omega$ and its stationary distribution by $\pi$ (or $\Omega_n$ and $\pi_n$, respectively, for the $n$-th chain in a sequence of chains). We say that the chain is finite, whenever $\Omega$ is finite. Let $(X_t)_{t=0}^{\infty}$ be an irreducible Markov chain on a finite state space $\Omega$ with transition matrix $P$ and stationary distribution $\pi$. We denote such a chain by $(\Omega, P, \pi)$. A chain $(\Omega, P, \pi)$ is called reversible if $\pi(x)P(x,y) = \pi(y)P(y,x)$, for all $x, y \in \Omega$.

Periodicity issues can be avoided by considering the continuous-time version of the chain, $(X_t)_{t \ge 0}$. This is a continuous-time Markov chain whose heat kernel is defined by $H_t(x,y) := \sum_{k=0}^{\infty} \frac{e^{-t} t^k}{k!} P^k(x,y)$. It is a classic result of probability theory that for any initial condition the distribution of $X_t$ converges to $\pi$ when $t$ goes to infinity. The object of the theory of mixing times of Markov chains is to study the characteristics of this convergence (see [6] for a self-contained introduction to the subject). Throughout, we shall consider only continuous-time chains, although all our results can be stated also in discrete time, assuming $P(x,x) \ge \delta$ for some $\delta > 0$ for all $x \in \Omega$.

We denote by $\mathrm{H}_\mu^t$ ($\mathrm{H}_\mu$) the distribution of $X_t$ ($(X_t)_{t \ge 0}$), given that the initial distribution is $\mu$. When $\mu = \delta_x$ (where $\delta_x(y) = 1_{x=y}$), for some $x \in \Omega$, we simply write $\mathrm{H}_x^t$ ($\mathrm{H}_x$).

We denote the set of probability distributions on a (finite) set $B$ by $\mathscr{P}(B)$. For any $\mu, \nu \in \mathscr{P}(B)$, their total-variation distance is defined to be

$$\|\mu - \nu\|_{\mathrm{TV}} := \frac{1}{2}\sum_{x}|\mu(x) - \nu(x)| = \sum_{x \in B:\, \mu(x) > \nu(x)} \big(\mu(x) - \nu(x)\big).$$
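
As a small numerical illustration of this definition (a sketch only; the function name `total_variation` and the dictionary representation of distributions are assumptions made here, not notation from the paper), the two expressions for the total-variation distance can be computed and compared directly:

```python
def total_variation(mu, nu):
    """Total-variation distance between two distributions on a finite set.

    mu and nu are dicts mapping states to probabilities (missing keys mean 0).
    This representation is an illustrative assumption, not the paper's.
    """
    states = set(mu) | set(nu)
    # First expression: half of the L^1 distance.
    half_l1 = 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in states)
    # Second expression: sum of mu - nu over the states where mu exceeds nu.
    positive_part = sum(
        mu.get(x, 0.0) - nu.get(x, 0.0)
        for x in states
        if mu.get(x, 0.0) > nu.get(x, 0.0)
    )
    # Both expressions agree up to floating-point rounding.
    assert abs(half_l1 - positive_part) < 1e-12
    return half_l1


mu = {"a": 0.5, "b": 0.5}
nu = {"a": 0.25, "b": 0.25, "c": 0.5}
print(total_variation(mu, nu))
```

The internal assertion checks that the two formulas agree, which holds because both arguments sum to one.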

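The worst-case total variation distance at time $t$ is defined as $d(t) := \max_{x \in \Omega} d_x(t)$, where for any $\mu \in \mathscr{P}(\Omega)$,

$$d_\mu(t) := \|\mathrm{H}_\mu(X_t \in \cdot) - \pi\|_{\mathrm{TV}} = \|\mathrm{H}_\mu^t - \pi\|_{\mathrm{TV}} = \frac{1}{2}\sum_{y}\Big|\sum_{x}\mu(x)H_t(x,y) - \pi(y)\Big|.$$

The $\varepsilon$-mixing-time is defined as

$$t_{\mathrm{mix}}(\varepsilon) := \inf\{t : d(t) \le \varepsilon\}.$$

We also define the $\varepsilon$-mixing-time w.r.t. a fixed initial distribution $\mu$ to be $t_{\mathrm{mix},\mu}(\varepsilon) := \inf\{t : d_\mu(t) \le \varepsilon\}$. When $\varepsilon = 1/4$ we simply write $t_{\mathrm{mix}}$ and $t_{\mathrm{mix},\mu}$.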
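
The definitions of the heat kernel, of $d(t)$ and of the mixing time can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the helper names (`heat_kernel`, `worst_case_tv`, `mixing_time`) are hypothetical, the Poisson series is truncated, and the mixing time is located by bisection, which relies on $d(t)$ being non-increasing in $t$.

```python
import math


def heat_kernel(P, t, tol=1e-12):
    """H_t = sum_k e^{-t} t^k / k! * P^k, truncated once the Poisson tail is below tol."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    weight = math.exp(-t)  # Poisson(t) mass at k = 0
    H = [[weight * power[i][j] for j in range(n)] for i in range(n)]
    mass, k = weight, 0
    while 1.0 - mass > tol and k < 10_000:
        k += 1
        power = [[sum(power[i][l] * P[l][j] for l in range(n)) for j in range(n)]
                 for i in range(n)]
        weight *= t / k
        mass += weight
        for i in range(n):
            for j in range(n):
                H[i][j] += weight * power[i][j]
    return H


def worst_case_tv(P, pi, t):
    """d(t): the maximum over starting states x of the TV distance of H_t(x, .) from pi."""
    H = heat_kernel(P, t)
    n = len(P)
    return max(0.5 * sum(abs(H[x][y] - pi[y]) for y in range(n)) for x in range(n))


def mixing_time(P, pi, eps=0.25, iters=60):
    """Approximate inf{t : d(t) <= eps} by bisection (d is non-increasing in t)."""
    lo, hi = 0.0, 1.0
    while worst_case_tv(P, pi, hi) > eps:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if worst_case_tv(P, pi, mid) <= eps:
            hi = mid
        else:
            lo = mid
    return hi


# Two-state chain that jumps to a uniform state at rate 1: here d(t) = e^{-t} / 2,
# so the 1/4-mixing time should be close to log 2.
P = [[0.5, 0.5], [0.5, 0.5]]
pi = [0.5, 0.5]
print(mixing_time(P, pi))
```

The pure-Python matrix products are only suitable for very small state spaces; for anything larger a numerical linear algebra library would be the natural choice.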

Recall that if $(\Omega, P, \pi)$ is a finite reversible irreducible chain, then $P$ is self-adjoint w.r.t. the inner product induced by $\pi$ (see Definition 2.1) and hence has $|\Omega|$ real eigenvalues. Throughout we shall denote them by $1 = \lambda_1 > \lambda_2 \ge \dots \ge \lambda_{|\Omega|} \ge -1$ (where $\lambda_2 < 1$, by irreducibility). Define the relaxation-time of $P$ as $t_{\mathrm{rel}} := (1 - \lambda_2)^{-1}$. The following general relation holds for reversible chains,

$$t_{\mathrm{rel}} \log\Big(\frac{1}{2\varepsilon}\Big) \le t_{\mathrm{mix}}(\varepsilon) \le \log\Big(\frac{1}{\varepsilon \min_x \pi(x)}\Big) t_{\mathrm{rel}}, \tag{1.1}$$

(see [6] Lemmas 20.5 and 20.11).
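
As a sanity check of the two-sided relation between $t_{\mathrm{mix}}(\varepsilon)$ and $t_{\mathrm{rel}}$, here is a worked example for the two-state chain. It is an illustration added here, not part of the original argument; the jump probabilities $a, b$ and the symbols $\Pi$, $\pi_{\max}$, $\pi_{\min}$ are notation introduced only for this example.

```latex
% Two-state chain on \Omega = \{0,1\} with P(0,1) = a and P(1,0) = b, where a, b \in (0,1].
\pi = \Big(\frac{b}{a+b}, \frac{a}{a+b}\Big), \qquad
\lambda_2 = 1 - a - b, \qquad
t_{\mathrm{rel}} = \frac{1}{a+b}.
% With \Pi the matrix whose rows both equal \pi, the heat kernel is
H_t = \Pi + e^{-t/t_{\mathrm{rel}}}\,(I - \Pi),
\qquad\text{so}\qquad
d(t) = \pi_{\max}\, e^{-t/t_{\mathrm{rel}}},
\quad \pi_{\max} := \max(\pi(0), \pi(1)) \ge \tfrac12 .
% Hence, for 0 < \varepsilon < 1/2,
t_{\mathrm{mix}}(\varepsilon) = t_{\mathrm{rel}} \log\big(\pi_{\max}/\varepsilon\big),
% and both bounds can be read off from 1/2 \le \pi_{\max} \le 1 \le 1/\pi_{\min}:
t_{\mathrm{rel}} \log\Big(\frac{1}{2\varepsilon}\Big)
\le t_{\mathrm{rel}} \log\Big(\frac{\pi_{\max}}{\varepsilon}\Big)
\le t_{\mathrm{rel}} \log\Big(\frac{1}{\varepsilon\, \pi_{\min}}\Big).
```

In this example the lower bound is attained exactly when $a = b$, that is, when $\pi$ is uniform.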

The following mixing parameter, introduced in [3], shall play a key role in this work. 

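Definition 1.1. Let $(\Omega, P, \pi)$ be an irreducible chain. For any $\mu \in \mathscr{P}(\Omega)$, $\delta, \varepsilon \in (0,1)$ and $t \ge 0$, define $p_\mu(\delta, t) := \max_{B \subset \Omega:\, \pi(B) \ge \delta} \mathrm{H}_\mu[T_B > t]$, where $T_B := \inf\{t : X_t \in B\}$ is the hitting time of the set $B$. Set $p(\delta, t) := \max_{x \in \Omega} p_x(\delta, t)$. We define

$$\mathrm{hit}_{\delta,\mu}(\varepsilon) := \min\{t : p_\mu(\delta, t) \le \varepsilon\} \quad\text{and}\quad \mathrm{hit}_{\delta}(\varepsilon) := \min\{t : p(\delta, t) \le \varepsilon\}.$$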
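
For very small chains the parameter $\mathrm{hit}_\delta(\varepsilon)$ can be evaluated by brute force over all sets $B$ with $\pi(B) \ge \delta$. The following sketch is illustrative only: the function names (`survival`, `p_worst`, `hit`) are hypothetical, the survival probability is computed from a truncated Poisson series for the chain killed upon entering $B$, and the enumeration over subsets is exponential in $|\Omega|$.

```python
import itertools
import math


def survival(P, x, B, t, tol=1e-12):
    """P_x[T_B > t] for the rate-1 continuous-time chain, via the chain killed on B."""
    if x in B:
        return 0.0
    rest = [y for y in range(len(P)) if y not in B]
    # vec[j] = probability that the jump chain is at rest[j] after k jumps without visiting B
    vec = [1.0 if y == x else 0.0 for y in rest]
    weight = math.exp(-t)  # Poisson(t) mass at k = 0
    total, mass, k = weight * sum(vec), weight, 0
    while 1.0 - mass > tol and k < 10_000:
        k += 1
        vec = [sum(vec[i] * P[rest[i]][rest[j]] for i in range(len(rest)))
               for j in range(len(rest))]
        weight *= t / k
        mass += weight
        total += weight * sum(vec)
    return total


def p_worst(P, pi, delta, t):
    """p(delta, t): worst survival probability over starting states and sets with pi(B) >= delta."""
    n = len(P)
    best = 0.0
    for r in range(1, n + 1):
        for B in itertools.combinations(range(n), r):
            if sum(pi[y] for y in B) >= delta - 1e-12:
                B_set = set(B)
                best = max(best, max(survival(P, x, B_set, t) for x in range(n)))
    return best


def hit(P, pi, delta, eps, iters=60):
    """Approximate min{t : p(delta, t) <= eps} by bisection (p is non-increasing in t)."""
    lo, hi = 0.0, 1.0
    while p_worst(P, pi, delta, hi) > eps:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p_worst(P, pi, delta, mid) <= eps:
            hi = mid
        else:
            lo = mid
    return hi


# Two-state example: from state 0 the set {1} is entered at rate 1/2,
# so p(1/2, t) = e^{-t/2} and hit_{1/2}(1/4) should be close to 2 log 4.
P = [[0.5, 0.5], [0.5, 0.5]]
pi = [0.5, 0.5]
print(hit(P, pi, delta=0.5, eps=0.25))
```

The version with a fixed initial distribution $\mu$ would average `survival` over $\mu$ instead of maximizing over $x$.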

Next, consider a sequence of such chains, $((\Omega_n, P_n, \pi_n) : n \in \mathbb{N})$, each with its corresponding worst-distance from stationarity $d_n(t)$, its mixing-time $t_{\mathrm{mix}}^{(n)}$, etc. Loosely speaking, the (total variation) cutoff phenomenon occurs when over a negligible period of time, known as the cutoff window, the (worst-case) total variation distance (of a certain finite Markov chain from its stationary distribution) drops abruptly from a value close to 1 to near 0. In other words, one should run the $n$-th chain until time $(1 - o(1))t_{\mathrm{mix}}^{(n)}$ for it to even slightly mix in total variation, whereas running it any further after time $(1 + o(1))t_{\mathrm{mix}}^{(n)}$ is essentially redundant. Formally, we say that a sequence of chains exhibits a cutoff if the following sharp transition in its convergence to stationarity occurs:

$$\lim_{n \to \infty} \frac{t_{\mathrm{mix}}^{(n)}(\varepsilon)}{t_{\mathrm{mix}}^{(n)}(1-\varepsilon)} = 1, \quad\text{for every } 0 < \varepsilon < 1.$$

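Similarly, for a sequence of initial distributions $\mu_n \in \mathscr{P}(\Omega_n)$, we say that a sequence of chains exhibits a $\mu_n$-cutoff if

$$\lim_{n \to \infty} \frac{t_{\mathrm{mix},\mu_n}^{(n)}(\varepsilon)}{t_{\mathrm{mix},\mu_n}^{(n)}(1-\varepsilon)} = 1, \quad\text{for every } 0 < \varepsilon < 1.$$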

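We say that the sequence has a cutoff window $w_n$, if $w_n = o(t_{\mathrm{mix}}^{(n)})$ and for any $\varepsilon \in (0,1)$ there exists $c_\varepsilon > 0$ such that for all $n$

$$t_{\mathrm{mix}}^{(n)}(\varepsilon) - t_{\mathrm{mix}}^{(n)}(1-\varepsilon) \le c_\varepsilon w_n$$

(respectively, $t_{\mathrm{mix},\mu_n}^{(n)}(\varepsilon) - t_{\mathrm{mix},\mu_n}^{(n)}(1-\varepsilon) \le c_\varepsilon w_n$).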

We say that a family of chains satisfies the product condition if $(1 - \lambda_2^{(n)})\,t_{\mathrm{mix}}^{(n)} \to \infty$ as $n \to \infty$ (or equivalently, $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix}}^{(n)})$). The following well-known fact follows easily from the first inequality in (1.1) (c.f. [6], Proposition 18.4).

Fact 1.2. For a sequence of irreducible reversible Markov chains, if the sequence exhibits a cutoff, then $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix}}^{(n)})$.

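Definition 1.3. Let $(\Omega_n, P_n, \pi_n)$ be a sequence of irreducible chains and let $\alpha \in (0,1)$. We say that the sequence exhibits a $\mathrm{hit}_\alpha$-cutoff (resp. $\mathrm{hit}_{\alpha,\mu_n}$-cutoff), if for every $0 < \varepsilon < 1/4$,

$$\mathrm{hit}_\alpha^{(n)}(\varepsilon) - \mathrm{hit}_\alpha^{(n)}(1-\varepsilon) = o\big(\mathrm{hit}_\alpha^{(n)}(1/4)\big)$$

(respectively, $\mathrm{hit}_{\alpha,\mu_n}^{(n)}(\varepsilon) - \mathrm{hit}_{\alpha,\mu_n}^{(n)}(1-\varepsilon) = o\big(\mathrm{hit}_{\alpha,\mu_n}^{(n)}(1/4)\big)$).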
The main abstract result in [3] is the following theorem.

Theorem 1. Let $(\Omega_n, P_n, \pi_n)$ be a sequence of reversible irreducible finite Markov chains. The following are equivalent:

(1) The sequence exhibits a cutoff.

(2) The sequence exhibits a $\mathrm{hit}_\alpha$-cutoff for some $\alpha \in (0,1)$ and $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix}}^{(n)})$.

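Definition 1.4. Let $\mu \in \mathscr{P}(\Omega)$ and $0 < \alpha < 1$. Define

$$t_{\mathrm{H},\mu}(\alpha) := \max_{A \subset \Omega:\, \pi(A) \ge \alpha} \mathbb{E}_\mu[T_A] \quad\text{and}\quad t_{\mathrm{H}}(\alpha) := \max_{x \in \Omega,\, A \subset \Omega:\, \pi(A) \ge \alpha} \mathbb{E}_x[T_A].$$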
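
The parameter $t_{\mathrm{H}}(\alpha)$ can likewise be evaluated by brute force on a toy chain. The sketch below is an illustration under assumptions: the function names are hypothetical, and it uses the fact that for the rate-1 continuous-time chain the holding times have mean 1, so $\mathbb{E}_x[T_A]$ equals the expected number of jumps of the discrete chain before it enters $A$, which is the minimal solution of $h = 1 + Ph$ off $A$ with $h = 0$ on $A$.

```python
import itertools


def expected_hitting_times(P, A, sweeps=100_000, tol=1e-13):
    """E_x[T_A] for every x, obtained by iterating h <- 1 + P h off A, h = 0 on A."""
    n = len(P)
    h = [0.0] * n
    for _ in range(sweeps):
        new = [0.0 if x in A else 1.0 + sum(P[x][y] * h[y] for y in range(n))
               for x in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, h)) < tol
        h = new
        if converged:
            break
    return h


def t_H(P, pi, alpha):
    """Worst expected hitting time over starting states and sets A with pi(A) >= alpha."""
    n = len(P)
    worst = 0.0
    for r in range(1, n + 1):
        for A in itertools.combinations(range(n), r):
            if sum(pi[y] for y in A) >= alpha - 1e-12:
                worst = max(worst, max(expected_hitting_times(P, set(A))))
    return worst


# Two-state example: from state 0 the set {1} is reached after a Geometric(1/2)
# number of jumps, so t_H(1/2) should be close to 2.
P = [[0.5, 0.5], [0.5, 0.5]]
pi = [0.5, 0.5]
print(t_H(P, pi, 0.5))
```

The fixed-point iteration converges geometrically for an irreducible finite chain; solving the linear system directly would be the more efficient route on larger examples.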

This work was greatly motivated by the results of Peres and Sousi in [9]. Similar results were obtained independently by Oliveira [8]. Both papers refine previous results of Aldous [1] and of Lovász and Winkler [7]. Their results share the general theme of describing mixing-times in terms of hitting-times. Their approach relied on the theory of random times to stationarity combined with a certain "de-randomization" argument which shows that for any reversible irreducible finite chain and any stopping time $T$ such that $X_T \sim \pi$, $t_{\mathrm{mix}} = O(\max_{x \in \Omega} \mathbb{E}_x[T])$. As a consequence, they showed that for any $0 < \alpha < 1/2$ (this was extended to $\alpha = 1/2$ in [5]), there exist some constants $c_\alpha, c'_\alpha > 0$ such that for any reversible irreducible finite chain

$$c'_\alpha\, t_{\mathrm{H}}(\alpha) \le t_{\mathrm{mix}} \le c_\alpha\, t_{\mathrm{H}}(\alpha). \tag{1.3}$$

It is natural to ask whether the more studied mixing parameter $t_{\mathrm{H}}(\alpha)$ could be used in Theorem 1 instead of the mixing parameter $\mathrm{hit}_\alpha(\cdot)$.

The following theorem extends Theorem 1 to arbitrary starting distributions $\mu_n \in \mathscr{P}(\Omega_n)$, such that $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix},\mu_n}^{(n)})$. In addition, it asserts that "cutoff" w.r.t. these initial distributions (i.e. $\mu_n$-cutoff) is in fact equivalent to concentration of hitting times of sets which are "worst in expectation" w.r.t. these initial distributions (in the sense of Definition 1.4).

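Theorem 2. Let $(\Omega_n, P_n, \pi_n)$ be a sequence of finite irreducible reversible chains. Let $\mu_n \in \mathscr{P}(\Omega_n)$ be such that $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix},\mu_n}^{(n)}(\delta))$, for some $0 < \delta < 1$. Then the following are equivalent:

i) The sequence exhibits a $\mu_n$-cutoff.

ii) There exists some $\alpha \in (0,1)$ such that the sequence exhibits a $\mathrm{hit}_{\alpha,\mu_n}$-cutoff.

iii) There exist some $\alpha \in (0,1)$ and a sequence of sets $A_n \subset \Omega_n$ with $\pi_n(A_n) \ge \alpha$ satisfying that $\mathbb{E}_{\mu_n}[T_{A_n}] = t_{\mathrm{H},\mu_n}^{(n)}(\alpha)$, such that

$$\lim_{n \to \infty} \mathrm{H}_{\mu_n}\big[\,|T_{A_n} - t_{\mathrm{mix},\mu_n}^{(n)}| \le \varepsilon\, \mathbb{E}_{\mu_n}[T_{A_n}]\,\big] = 1, \quad\text{for every } \varepsilon > 0.$$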

Corollary 1.5. Let $(\Omega_n, P_n, \pi_n)$ be a sequence of finite irreducible reversible transitive chains. Then the following are equivalent:

i) The sequence exhibits a cutoff.

ii) The sequence satisfies the product condition, and for some sequence $x_n \in \Omega_n$ and some $0 < \alpha < 1$, there exists a sequence of sets $A_n \subset \Omega_n$ with $\pi_n(A_n) \ge \alpha$ satisfying that $\mathbb{E}_{x_n}[T_{A_n}] = t_{\mathrm{H}}^{(n)}(\alpha)$, such that

$$\lim_{n \to \infty} \mathrm{H}_{x_n}\big[\,|T_{A_n} - t_{\mathrm{mix}}^{(n)}| \le \varepsilon\, \mathbb{E}_{x_n}[T_{A_n}]\,\big] = 1, \quad\text{for every } \varepsilon > 0.$$

The following proposition asserts that in general cutoff (as opposed to cutoff from a sequence of fixed initial distributions) cannot be characterized in terms of the mixing parameter $t_{\mathrm{H}}(\alpha)$.

Proposition 1.6. There exists a sequence $(\Omega_n, P_n, \pi_n)$ of finite irreducible reversible chains satisfying the product condition such that the following holds:

- There exist $A_n \subset \Omega_n$ and $x_n \in \Omega_n$ such that $\pi_n(A_n) \ge 1/2$ and $\mathbb{E}_{x_n}[T_{A_n}] = t_{\mathrm{H}}^{(n)}(1/2)$.

- The distributions of the hitting times of $A_n$ are concentrated w.r.t. the initial states $x_n$.

- The sequence does not exhibit a cutoff.

In Example 5.1 we construct a sequence of chains which exhibits the behavior described in Proposition 1.6.

Remark 1.7. It was shown in [4] that a sequence of finite continuous-time Markov chains exhibits a cutoff iff $t_{\mathrm{L}}^{(n)}(\varepsilon) - t_{\mathrm{L}}^{(n)}(1-\varepsilon) = o(t_{\mathrm{L}}^{(n)}(1/4))$, where $t_{\mathrm{L}}^{(n)}(\varepsilon)$ is the $\varepsilon$-mixing-time of the associated lazy chain. They also showed that the same holds for a sequence of fixed initial distributions. Hence in part (i) of Theorem 2 and of Corollary 1.5 we could have considered the lazy version of the chain, rather than its continuous-time version.

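The main ingredient in the proof of Theorem 2 is the following proposition.

Proposition 1.8. Let $(\Omega, P, \pi)$ be a finite irreducible reversible Markov chain. Let $\mu \in \mathscr{P}(\Omega)$. Let $0 < \varepsilon < 1/2$. Then

$$\mathrm{hit}_{1/2,\mu}(3\varepsilon/2) - 2t_{\mathrm{rel}}|\log\varepsilon| \le t_{\mathrm{mix},\mu}(\varepsilon) \le \mathrm{hit}_{1/2,\mu}(3\varepsilon/4) + t_{\mathrm{rel}}\log(16/\varepsilon),$$
$$\mathrm{hit}_{1/2,\mu}(1-\varepsilon/2) - 2t_{\mathrm{rel}}|\log\varepsilon| \le t_{\mathrm{mix},\mu}(1-\varepsilon) \le \mathrm{hit}_{1/2,\mu}(1-2\varepsilon) + \tfrac{1}{2}t_{\mathrm{rel}}\log 8.$$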

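The separation of $y \in \Omega$ w.r.t. $\mu \in \mathscr{P}(\Omega)$ is defined as $1 - \mu(y)/\pi(y)$. An important notion of distance from stationarity, intimately related to total variation distance, is the separation distance of $\mu$ (from $\pi$) defined as $s_\mu := \max_{y \in \Omega} \big(1 - \mu(y)/\pi(y)\big)$. Generally, $\|\mu - \pi\|_{\mathrm{TV}} \le s_\mu$, for any $\mu \in \mathscr{P}(\Omega)$ (see e.g. Lemma 6.13 in [6]).

Other notions of distance from stationarity are the $L^\infty$ and $L^p$ distances ($1 \le p < \infty$), defined, respectively, as $\|\mu - \pi\|_{\infty,\pi} := \max_y \big|\frac{\mu(y)}{\pi(y)} - 1\big|$ and $\|\mu - \pi\|_{p,\pi} := \Big(\sum_y \pi(y)\big|\frac{\mu(y)}{\pi(y)} - 1\big|^p\Big)^{1/p}$. By Jensen's inequality,

$$2\|\mu - \pi\|_{\mathrm{TV}} = \|\mu - \pi\|_{1,\pi} \le \|\mu - \pi\|_{p,\pi} \le \|\mu - \pi\|_{\infty,\pi}, \quad\text{for any } 1 \le p < \infty.$$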
The $\varepsilon$ separation and $L^p$ mixing times are defined as $t_{\mathrm{sep}}(\varepsilon) := \inf\{t : \max_x s_{\mathrm{H}_x^t} \le \varepsilon\}$ and $\tau_p(\varepsilon) := \inf\{t : \max_x \|\mathrm{H}_x^t - \pi\|_{p,\pi} \le \varepsilon\}$, for $1 \le p \le \infty$. Then

$$t_{\mathrm{mix}}(\varepsilon) \le t_{\mathrm{sep}}(\varepsilon) \quad\text{and}\quad t_{\mathrm{mix}}(\varepsilon/2) \le \tau_p(\varepsilon) \le \tau_{p'}(\varepsilon), \quad\text{for any } 0 < \varepsilon < 1 \text{ and } 1 \le p \le p' \le \infty.$$

In general, one always has that $t_{\mathrm{sep}}(\varepsilon) \le 2t_{\mathrm{mix}}(\varepsilon/4)$ (c.f. [6] Lemma 19.3). Conversely, in many cases $t_{\mathrm{sep}}^{(n)}(1 - o(1)) \ge (2 - o(1))t_{\mathrm{mix}}^{(n)}$ (e.g. lazy simple random walk on the $n$-dimensional hypercube, see Theorem 18.8 in [6]). The $L^2$ mixing-time in many cases satisfies $t_{\mathrm{mix}}^{(n)} = o\big(\tau_2^{(n)}(1/2)\big)$. The following theorem gives a surprising quantitative converse, when one replaces the separation and $L^p$ distances by weaker notions of distance from stationarity which have a similar form.

An alternative definition of $s_{\mathrm{H}_x^t}$ is that it is the smallest number $a$ such that one can write $\mathrm{H}_x^t = a\nu + (1-a)\pi$. A natural relaxation of this is to look for a small $a$ such that one can write $\mathrm{H}_x^t = a\nu + (1-a)\mu$ for $\mu$ satisfying that $\pi(\{y : \mu(y) < (1-\varepsilon)\pi(y)\}) \le \varepsilon$. Similarly, one can require $\mu$ to be a mixture of the form $\sum_{i \in I} c_i \pi_{A_i}$, for some collection of sets $(A_i)_{i \in I}$ such that $\pi(A_i) \ge 1 - \varepsilon$ for all $i$, where $\sum_{i \in I} c_i = 1$ and $\pi_A$ denotes $\pi$ conditioned on $A$ (i.e. $\pi_A(y) = 1_{y \in A}\,\pi(y)/\pi(A)$). In the same spirit, a relaxation of the $L^p$ distance from $\pi$ is to look for a small $a$ such that one can write $\mathrm{H}_x^t = a\nu + (1-a)\mu$ for $\mu$ such that $\|\mu - \pi\|_{p,\pi} \le \varepsilon$.

The following theorem, which we call the Decomposition Theorem, asserts that for $t$ which is slightly larger than $\mathrm{hit}_{1-\varepsilon,\sigma}(p)$, one can indeed write $\mathrm{H}_\sigma^t$ as such a mixture, with $a$ slightly larger than $p$. The statement of the Decomposition Theorem may seem cumbersome at first sight. The reader may find it easier to first think of $p$ and $\varepsilon$ as constants and of $w$ as tending to infinity.

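Theorem 1.9. Let $(\Omega, P, \pi)$ be a finite irreducible reversible Markov chain. Let $\sigma \in \mathscr{P}(\Omega)$. Let $w \in \mathbb{R}$ and $0 < \varepsilon, p < 1$. Denote $C_{\varepsilon,p} := \frac{1}{2}\log\big(\frac{16}{\varepsilon p^2}\big)$. Let

$$\tau = \tau_{w,\varepsilon,p} := t_{\mathrm{rel}}\,[w + C_{\varepsilon,p}] \quad\text{and}\quad \rho = \rho_{p,w} := pe^{-w}.$$

Then for any $w \ge -C_{\varepsilon,p}$, there exist some $0 \le c_\tau \le a_\tau := p + \rho(1-p)$ and two distributions $\nu = \nu_\tau$ and $\mu = \mu_\tau$ such that the following hold:

$$\mathrm{H}_\sigma^{t+\tau} = c_\tau \nu + (1 - c_\tau)\mu, \quad\text{where } t := \mathrm{hit}_{1-\varepsilon,\sigma}(p),$$

$$\mu(x)/\pi(x) \le \frac{1}{1-\rho} = 1 + O(pe^{-w}), \quad\text{for any } x \in \Omega, \tag{1.4}$$

and for any $b \in [0,1]$,

$$\pi\{x : \mu(x) < (1-b)\pi(x)\} \le \frac{1}{1 + (2b/\rho)^2} = O\big((\rho/b)^2\big). \tag{1.5}$$

Moreover,

$$\|\mu - \pi\|_{2,\pi}^2 \le (\rho/2)^2 \log\big[1 + (2/\rho)^2\big] + \frac{\rho^2}{(1-\rho)^2} = O\big(w p^2 e^{-2w}\big). \tag{1.6}$$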
In addition, let $d_\tau := \frac{\rho}{1-\rho} + \rho e^{w/2}/2 = O(pe^{-w/2})$. Then there exists $\kappa_\tau \le d_\tau$ such that

$$\mu = \kappa_\tau \mu_1 + (1 - \kappa_\tau)\mu_2, \quad\text{where } \mu_1, \mu_2 \in \mathscr{P}(\Omega) \text{ and } \mu_2 = \sum_{A \subset \Omega} q(A)\,\pi_A, \tag{1.7}$$

where $q$ is some distribution on $\big\{A \subset \Omega : \pi(A) \ge \frac{e^w}{1+e^w}\big\}$.

Recall the definition of $t_{\mathrm{H}}(\varepsilon)$ from Definition 1.4. In [5] the following general inequality was proved (without a reversibility assumption).

Theorem 1.10. Fix $0 < \varepsilon < 1/2$. For any irreducible finite Markov chain, $\varepsilon\, t_{\mathrm{H}}(\varepsilon) \le t_{\mathrm{H}}(1/2)$.


We prove a specialized version of the above result for the reversible setup. In many 
cases the bounds obtained from Proposition 1.11 are considerably better than the bound in 
Theorem 1.10 (in particular, this is the case when the product condition holds). 


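Proposition 1.11. There exists an absolute constant $C > 0$ such that for any finite reversible irreducible chain $(\Omega, P, \pi)$,

$$t_{\mathrm{H},\mu}(\varepsilon) - t_{\mathrm{H},\mu}(1-\varepsilon) \le C\varepsilon^{-1}t_{\mathrm{rel}}, \quad\text{for every } 0 < \varepsilon < 1/2 \text{ and } \mu \in \mathscr{P}(\Omega).$$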


1.1 A remark about our approach 

The approach taken here for relating $t_{\mathrm{mix},\mu}(\cdot)$ and $\mathrm{hit}_{\cdot,\mu}(\cdot)$ follows that taken in [3]. Namely, we define for any $B \subset \Omega$ the set $G = G_s(B, m)$, which we call the good set for $B$ from time $s$ within $m$ standard-deviations. This set is defined formally in (2.3). The reason for the name will become clear once this set is defined formally. Fix some $0 < \varepsilon < 1$ and set $m = m_\varepsilon := 2/\sqrt{\varepsilon}$. Fix some initial distribution $\mu$. From the definition of $G$, it will be easy to see that for any $t \ge 0$, hitting $G$ by time $t$ serves as a "certificate" that the chain is "$\varepsilon$-mixed w.r.t. $B$" at time $t + s$.

If for some $s = s_\varepsilon$ we had that $\pi(G_s(B, m_\varepsilon)) \ge 1 - \varepsilon$ for all $B \subset \Omega$, then if by time $t$ any set of stationary probability at least $1 - \varepsilon$ is hit with probability at least $1 - \varepsilon$ given that $X_0 \sim \mu$ (i.e. $t \ge \mathrm{hit}_{1-\varepsilon,\mu}(\varepsilon)$), then by using the certificate $T_{G_s(B, m_\varepsilon)} \le t$ and then maximizing over all $B$ we get that $t_{\mathrm{mix},\mu}(2\varepsilon) \le t + s$.

As in [3], we shall use Starr's maximal inequality (Theorem 2.3) in conjunction with the $L^2$-contraction Lemma (Lemma 2.2) to show that there exists an absolute constant $C > 0$ such that $\pi(G_{s_\varepsilon}(B, m_\varepsilon)) \ge 1 - \varepsilon$ for all $B$, where $s_\varepsilon := Ct_{\mathrm{rel}}|\log\varepsilon|$.

In order to relate $t_{\mathrm{mix},\mu}(\cdot)$ and $t_{\mathrm{H},\mu}(\cdot)$, we relate $t_{\mathrm{H},\mu}(\cdot)$ and $\mathrm{hit}_{\alpha+\varepsilon,\mu}(\cdot)$ by showing that for any $\varepsilon, \alpha, p > 0$ such that $\varepsilon + \alpha < 1$ and $\varepsilon + p < 1$, if $A$ with $\pi(A) \ge \alpha$ is "worst in expectation" ($t_{\mathrm{H},\mu}(\alpha) = \mathbb{E}_\mu[T_A]$) then $\mathrm{hit}_{\alpha,\mu}(p + \varepsilon) \le \min\{t : \mathrm{H}_\mu[T_A > t] \le p\} + C_{\varepsilon,\alpha}t_{\mathrm{rel}}$ (see Lemma 4.7).


2 Starr’s Maximal inequality 

In this section we present the machinery that will be utilized in the proof of the main results. 
The most important tool we shall utilize is Starr’s L 2 maximal inequality (Theorem 2.3). 
We start with a few basic definitions and facts. We denote $\mathbb{Z}_+ := \{n \in \mathbb{N} : n \ge 0\}$ and $\mathbb{R}_+ := \{t \in \mathbb{R} : t \ge 0\}$.
Definition 2.1. Let $(\Omega, P, \pi)$ be a finite reversible chain. For any $f \in \mathbb{R}^\Omega$, let $\mathbb{E}_\pi[f] := \sum_{x \in \Omega}\pi(x)f(x)$ and $\mathrm{Var}_\pi f := \mathbb{E}_\pi[(f - \mathbb{E}_\pi f)^2]$. The inner-product $\langle\cdot,\cdot\rangle_\pi$ and $L^2$ norm are

$$\langle f, g\rangle_\pi := \mathbb{E}_\pi[fg] \quad\text{and}\quad \|f\|_2 := \big(\mathbb{E}_\pi[|f|^2]\big)^{1/2}.$$

We identify $H_t$ with the operator $H_t : L^2(\mathbb{R}^\Omega, \pi) \to L^2(\mathbb{R}^\Omega, \pi)$, defined by

$$H_t f(x) := \sum_{y \in \Omega} H_t(x,y)f(y) = \mathbb{E}_x[f(X_t)].$$

By reversibility $H_t$ is self-adjoint (w.r.t. $\langle\cdot,\cdot\rangle_\pi$).

The following lemma is standard and follows from elementary linear algebra using the spectral decomposition of a function $f \in \mathbb{R}^\Omega$ (see e.g. Lemma 20.5 in [6]).

Lemma 2.2 ($L^2$-contraction Lemma). Let $(\Omega, P, \pi)$ be a finite reversible irreducible Markov chain. Let $f \in \mathbb{R}^\Omega$. Then

$$\mathrm{Var}_\pi H_t f \le e^{-2t/t_{\mathrm{rel}}}\,\mathrm{Var}_\pi f, \quad\text{for any } t \ge 0. \tag{2.1}$$

We now state a particular case of Starr's maximal inequality ([10] Proposition 3). The proof in the discrete-time setup can be found in [3]. Let $f \in \mathbb{R}^\Omega$. Define its maximal function by $f^*(x) := \sup_{t \ge 0}|H_t f(x)|$.

Theorem 2.3 (Maximal inequality [10]). Let $(\Omega, P, \pi)$ be a finite reversible irreducible Markov chain. Then for any $f \in \mathbb{R}^\Omega$,

$$\|f^*\|_2 \le 2\|f\|_2. \tag{2.2}$$

For any $B \subset \Omega$ and $s \in \mathbb{R}_+$, set $\rho(B) := \sqrt{\pi(B)(1 - \pi(B))} = \sqrt{\mathrm{Var}_\pi 1_B}$ and $\sigma_s := \rho(B)e^{-s/t_{\mathrm{rel}}}$. Note that by Lemma 2.2, $\sigma_s \ge \sqrt{\mathrm{Var}_\pi H_s 1_B}$. We define the good set for $B$ from time $s$ within $m$ standard-deviations to be

$$G_s(B, m) := \{y : |\mathrm{H}_y^t(B) - \pi(B)| < m\sigma_s, \text{ for all } t \ge s\}. \tag{2.3}$$

The motivation behind the definition in (2.3) was previously explained in § 1.1. The following corollary follows by combining Lemma 2.2 with Theorem 2.3.

Corollary 2.4. Let $(\Omega, P, \pi)$ be a finite reversible irreducible chain. Then

$$\pi(G_s(B, m)) \ge 1 - \frac{4}{m^2}, \quad\text{for all } B \subset \Omega,\ s \ge 0 \text{ and } m > 0. \tag{2.4}$$

Proof. For any $s \ge 0$, let $f_s(x) := H_s(1_B(x) - \pi(B)) = \mathrm{H}_x^s(B) - \pi(B)$. Then by Lemma 2.2 and Theorem 2.3

$$\|f_s^*\|_2 \le 2\|f_s\|_2 \le 2e^{-s/t_{\mathrm{rel}}}\|1_B - \pi(B)\|_2 = 2\sigma_s. \tag{2.5}$$

For any $t \ge 0$, $H_t f_s(x) = f_{t+s}(x) = \mathrm{H}_x^{t+s}(B) - \pi(B)$. Then, in the notation of Theorem 2.3,

$$f_s^*(x) := \sup_{t \ge 0}|H_t f_s(x)| = \sup_{t \ge s}|\mathrm{H}_x^t(B) - \pi(B)|.$$

Hence $D := \{x \in \Omega : f_s^*(x) \ge m\sigma_s\}$ is the complement of $G_s(B, m)$. Thus by Markov's inequality and (2.5)

$$1 - \pi(G_s(B, m)) = \pi(D) = \pi\{(f_s^*)^2 \ge (m\sigma_s)^2\} \le 4m^{-2}.$$

□
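
For readers who want the last step of the proof above spelled out, the following is one way to write it. It is a sketch added here, assuming $0 < \pi(B) < 1$ so that $\sigma_s > 0$, and it uses only Markov's inequality applied to $(f_s^*)^2$ together with the bound $\|f_s^*\|_2 \le 2\sigma_s$.

```latex
\pi(D)
  = \pi\big\{ (f_s^*)^2 \ge (m\sigma_s)^2 \big\}
  \le \frac{\mathbb{E}_\pi\big[(f_s^*)^2\big]}{(m\sigma_s)^2}
  = \frac{\|f_s^*\|_2^2}{m^2\sigma_s^2}
  \le \frac{(2\sigma_s)^2}{m^2\sigma_s^2}
  = \frac{4}{m^2}.
```

In particular, the choice $m = 2/\sqrt{\varepsilon}$ gives $\pi(D) \le \varepsilon$.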
3 Inequalities relating $t_{\mathrm{mix}}(\cdot)$ and $\mathrm{hit}_{\cdot}(\cdot)$


Our aim in this section is to obtain inequalities relating $t_{\mathrm{mix}}(\varepsilon)$ and $\mathrm{hit}_\beta(\delta)$ for suitable values of $\beta$, $\varepsilon$ and $\delta$ using Corollary 2.4. As was shown in [3], these two notions of mixing are intimately connected to each other. In this section we refine the analysis from [3]. Corollary 3.3 below contains the more difficult half of Proposition 1.8. We end the section with a proof of the Decomposition Theorem.

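Lemma 3.1. Let $(\Omega, P, \pi)$ be a finite irreducible reversible chain. Let $\mu \in \mathscr{P}(\Omega)$, $\varepsilon, p \in (0,1)$, $s \ge 0$ and $B \subset \Omega$. Denote $\rho(B) := \mathrm{Var}_\pi 1_B = \pi(B)(1 - \pi(B))$. Denote $\tau_{\varepsilon,p} := \mathrm{hit}_{1-\varepsilon,\mu}(p)$ and $\ell_{s,p,\varepsilon} := 2(1-p)e^{-s/t_{\mathrm{rel}}}\sqrt{1/\varepsilon}$. Then

$$\pi(B) - \mathrm{H}_\mu^{\tau_{\varepsilon,p}+s}(B) \le p\pi(B) + \ell_{s,p,\varepsilon}\sqrt{\rho(B)} = p\pi(B) + 2(1-p)e^{-s/t_{\mathrm{rel}}}\sqrt{\rho(B)/\varepsilon}. \tag{3.1}$$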

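Proof. Consider the set

$$H := \{y : |\mathrm{H}_y^t(B) - \pi(B)| < \ell_{s,0,\varepsilon}\sqrt{\rho(B)} \text{ for all } t \ge s\}.$$

Then by Corollary 2.4, $\pi(H) \ge 1 - \varepsilon$. By the Markov property and the definition of $H$,

$$\mathrm{H}_\mu[X_{\tau_{\varepsilon,p}+s} \in B \mid T_H \le \tau_{\varepsilon,p}] \ge \pi(B) - \ell_{s,0,\varepsilon}\sqrt{\rho(B)}.$$

By the definition of $\tau_{\varepsilon,p}$ and the fact that $\pi(H) \ge 1 - \varepsilon$, we get that $\mathrm{H}_\mu[T_H \le \tau_{\varepsilon,p}] \ge 1 - p$. Thus

$$\pi(B) - \mathrm{H}_\mu^{\tau_{\varepsilon,p}+s}(B) \le \pi(B) - \mathrm{H}_\mu[X_{\tau_{\varepsilon,p}+s} \in B,\ T_H \le \tau_{\varepsilon,p}]$$
$$\le \mathrm{H}_\mu[T_H > \tau_{\varepsilon,p}]\,\pi(B) + \mathrm{H}_\mu[T_H \le \tau_{\varepsilon,p}]\big(\pi(B) - \mathrm{H}_\mu[X_{\tau_{\varepsilon,p}+s} \in B \mid T_H \le \tau_{\varepsilon,p}]\big)$$
$$\le \mathrm{H}_\mu[T_H > \tau_{\varepsilon,p}]\,\pi(B) + \mathrm{H}_\mu[T_H \le \tau_{\varepsilon,p}]\,\ell_{s,0,\varepsilon}\sqrt{\rho(B)} \le p\pi(B) + \ell_{s,p,\varepsilon}\sqrt{\rho(B)},$$

where the last inequality uses $\mathrm{H}_\mu[T_H > \tau_{\varepsilon,p}] \le p$ and $\ell_{s,p,\varepsilon} = (1-p)\ell_{s,0,\varepsilon}$ (if $\ell_{s,0,\varepsilon}\sqrt{\rho(B)} > \pi(B)$, then (3.1) holds trivially).

This concludes the proof of (3.1). □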

Definition 3.2. Let $0 \le r, p < 1$. Denote $D_r = D_{t,s,\mu,r,p} := \{x : \mathrm{H}_\mu^{t+s}(x) < (1 - p - r)\pi(x)\}$.

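Corollary 3.3. Let $(\Omega, P, \pi)$ be a reversible irreducible finite chain. Let $\mu \in \mathscr{P}(\Omega)$, $w \in \mathbb{R}$ and $p, \varepsilon \in (0,1)$.

(1) Let

$$s = s(w, \varepsilon, p) := t_{\mathrm{rel}}[w + k_{\varepsilon,p}], \quad\text{where } k_{\varepsilon,p} := \log\big(2\varepsilon^{-1/2}(1-p)\big). \tag{3.2}$$

Then

$$t_{\mathrm{mix},\mu}(p + 2e^{-w}) \le \mathrm{hit}_{1-\varepsilon,\mu}(p) + s, \quad\text{for all } w \ge -k_{\varepsilon,p}. \tag{3.3}$$

(2) For every $0 < \varepsilon < 1/2$,

$$t_{\mathrm{mix},\mu}(\varepsilon) \le \mathrm{hit}_{1/2,\mu}(3\varepsilon/4) + t_{\mathrm{rel}}\log(16/\varepsilon),$$
$$t_{\mathrm{mix},\mu}(1 - \varepsilon/2) \le \mathrm{hit}_{1-\varepsilon,\mu}(1-\varepsilon) + \tfrac{1}{2}t_{\mathrm{rel}}\log(64/\varepsilon) \quad\text{and} \tag{3.4}$$
$$\mathrm{hit}_{1-\varepsilon/4,\mu}(5\varepsilon/4) \le t_{\mathrm{mix},\mu}(\varepsilon) \le \mathrm{hit}_{1-\varepsilon/4,\mu}(3\varepsilon/4) + \tfrac{3}{2}t_{\mathrm{rel}}\log(16/\varepsilon). \tag{3.5}$$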
(3) (1)-(2) remain valid when $\mu$ is omitted.

Proof. The proof of (3) is parallel to that of (1)-(2) and thus omitted. We first prove (3.3). Denote $t := \mathrm{hit}_{1-\varepsilon,\mu}(p)$. First observe that from the definition of $D_r := D_{t,s,\mu,r,p}$, we have that

$$(1 - p - r)\pi(D_r) - \mathrm{H}_\mu^{t+s}[D_r] \ge 0.$$

Fix some $w \ge -k_{\varepsilon,p} = -\log\big(2\varepsilon^{-1/2}(1-p)\big)$. Then $s = s(w, \varepsilon, p) = t_{\mathrm{rel}}[w + k_{\varepsilon,p}] \ge 0$. By the definition of $k_{\varepsilon,p}$ we get that $2e^{-k_{\varepsilon,p}}(1-p)\sqrt{1/\varepsilon} = 1$. By the definition of $s$ we have that $\ell_{s,p,\varepsilon} := 2e^{-s/t_{\mathrm{rel}}}(1-p)\sqrt{1/\varepsilon} = e^{-w}$. By (3.1) we have that

$$0 \le (1 - p - r)\pi(D_r) - \mathrm{H}_\mu^{t+s}[D_r] \le -r\pi(D_r) + e^{-w}\sqrt{\pi(D_r)(1 - \pi(D_r))}.$$

Let $c > 0$. Consider the function $h : [0,1] \to \mathbb{R}$ defined as $h(x) = -cx + \sqrt{x(1-x)}$. Then $h(x) \ge 0$ iff $x \le (1 + c^2)^{-1}$. Taking $c = re^{w}$ and $x = \pi(D_r)$ we get that

$$\pi(D_r) \le \frac{1}{1 + r^2e^{2w}}. \tag{3.6}$$
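
The elementary equivalence used to obtain (3.6) can be checked as follows. This is a short verification added for convenience; it assumes $h$ has the form $h(x) = -cx + \sqrt{x(1-x)}$, which is the form needed for the stated equivalence, with $c > 0$ and $x \in [0,1]$.

```latex
% For x = 0 both sides hold trivially. For x \in (0,1]:
h(x) \ge 0
  \iff \sqrt{x(1-x)} \ge cx
  \iff x(1-x) \ge c^2 x^2
  \iff 1 - x \ge c^2 x
  \iff x \le \frac{1}{1+c^2}.
% Multiplying 0 \le -r\pi(D_r) + e^{-w}\sqrt{\pi(D_r)(1-\pi(D_r))} by e^{w}
% gives h(\pi(D_r)) \ge 0 with c = re^{w}, hence \pi(D_r) \le 1/(1 + r^2 e^{2w}).
```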


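Define $f : \Omega \to \mathbb{R}$ by $f(x) := \Big((1-p) - \frac{\mathrm{H}_\mu^{t+s}(x)}{\pi(x)}\Big)1_{x \in D_0}$. Then $0 \le f \le 1 - p$ and $\{f > r\} = D_r$, for any $0 \le r < 1 - p$. Note that

$$(1-p)\pi(D_0) - \mathrm{H}_\mu^{t+s}[D_0] = \mathbb{E}_\pi f = \int_{0}^{1-p}\pi\{f > r\}\,dr \le \int_{0}^{1-p}\frac{dr}{1 + r^2e^{2w}} = e^{-w}\int_{0}^{(1-p)e^{w}}\frac{du}{1 + u^2} \le e^{-w}\int_{0}^{(1-p)e^{w}}\frac{2\,du}{(1+u)^2} \le 2e^{-w}. \tag{3.7}$$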

Let $B := \{x \in \Omega : (1-p)\pi(x) \le \mathrm{H}_\mu^{t+s}(x) < \pi(x)\}$. Then by definition we have that $\pi(B) - \mathrm{H}_\mu^{t+s}(B) \le p\pi(B)$. Denote $C = B \cup D_0$. Then by the definition of the total variation distance (first equality), together with (3.7) and the fact that $\pi(B) - \mathrm{H}_\mu^{t+s}(B) \le p\pi(B)$ (both used in the first inequality), we get that

$$\|\mathrm{H}_\mu^{t+s} - \pi\|_{\mathrm{TV}} = \pi(C) - \mathrm{H}_\mu^{t+s}(C) = \pi(B) - \mathrm{H}_\mu^{t+s}(B) + p\pi(D_0) + (1-p)\pi(D_0) - \mathrm{H}_\mu^{t+s}(D_0)$$
$$\le p\pi(B) + p\pi(D_0) + 2e^{-w} = p\pi(C) + 2e^{-w} \le p + 2e^{-w}.$$

This concludes the proof of (3.3).

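The first inequality in (3.5) follows directly from the definition of the total variation distance. To see this, let $E \subset \Omega$ be an arbitrary set with $\pi(E) \ge 1 - \varepsilon/4$. Let $t_1 := t_{\mathrm{mix},\mu}(\varepsilon)$. Then $\mathrm{H}_\mu[T_E \le t_1] \ge \mathrm{H}_\mu[X_{t_1} \in E] \ge \pi(E) - \|\mathrm{H}_\mu^{t_1} - \pi\|_{\mathrm{TV}} \ge 1 - 5\varepsilon/4$. In particular, we get directly from Definition 1.1 that $\mathrm{hit}_{1-\varepsilon/4,\mu}(5\varepsilon/4) \le t_1 = t_{\mathrm{mix},\mu}(\varepsilon)$.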

The remaining inequalities are obtained by applying the previously established ones with particular choices of $(w, \varepsilon, p)$. We now specify the required choice of parameters for each inequality, leaving the necessary calculations to the reader. For the second inequality in (3.5), use (3.3) with $(w, \varepsilon, p)$ being $(\log(8/\varepsilon), \varepsilon/4, 3\varepsilon/4)$.

For the first inequality in (3.4), apply (3.3) with $(w, \varepsilon, p) = (\log(8/\varepsilon), 1/2, 3\varepsilon/4)$. For the second inequality in (3.4), apply (3.3) with $(w, \varepsilon, p)$ being $(\log(4/\varepsilon), \varepsilon, 1-\varepsilon)$. □
Proof of Theorem 1.9: We start by recalling the relevant notation. We denote $t := \mathrm{hit}_{1-\varepsilon,\sigma}(p)$,

$$\tau = \tau(w, \varepsilon, p) := t_{\mathrm{rel}}\Big[w + \frac{1}{2}\log\Big(\frac{16}{\varepsilon p^2}\Big)\Big],$$

$\rho = \rho(p, w) := pe^{-w}$ and $a_\tau = a_\tau(w, p) := p + \rho(1-p)$. For every $r \in [0, 1-p]$, let

$$D_r = D_{t,\tau,\sigma,r,p} := \{x : \mathrm{H}_\sigma^{t+\tau}(x) < (1 - p - r)\pi(x)\}.$$

Denote its complement by $B_r$. Then in the notation of (3.2)

$$\tau(w, \varepsilon, p) = s\Big(w + \log\frac{2}{p(1-p)},\ \varepsilon,\ p\Big), \qquad 2\exp\Big[-w - \log\frac{2}{p(1-p)}\Big] = p(1-p)e^{-w} = (1-p)\rho. \tag{3.8}$$

Then by (3.7) we have that

$$c_\tau := \mathrm{H}_\sigma^{t+\tau}[B_0] - (1-p)\pi(B_0) = p + \big((1-p)\pi(D_0) - \mathrm{H}_\sigma^{t+\tau}[D_0]\big) \le p + 2\exp\Big[-w - \log\frac{2}{p(1-p)}\Big] = p + (1-p)\rho = a_\tau. \tag{3.9}$$

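We may write $\mathrm{H}_\sigma^{t+\tau}$ as a mixture of two distributions of the following form:

$$\mathrm{H}_\sigma^{t+\tau} = c_\tau\nu + (1 - c_\tau)\mu,$$

where $\nu(x) := \frac{1_{x \in B_0}}{c_\tau}\big(\mathrm{H}_\sigma^{t+\tau}(x) - (1-p)\pi(x)\big)$ and $\mu$ is defined via the relation

$$\mu(\cdot) = \frac{\mathrm{H}_\sigma^{t+\tau}(\cdot) - c_\tau\nu(\cdot)}{1 - c_\tau} = \frac{\mathrm{H}_\sigma^{t+\tau}(\cdot)1_{\cdot \in D_0} + (1-p)\pi(\cdot)1_{\cdot \in B_0}}{1 - c_\tau} = \frac{\mathrm{H}_\sigma^{t+\tau}(\cdot) \wedge (1-p)\pi(\cdot)}{1 - c_\tau}, \tag{3.10}$$

where $a \wedge b := \min(a, b)$. By construction and (3.9) we have that $\mu(x) \le \big(\frac{1-p}{1-c_\tau}\big)\pi(x)$ for any $x \in \Omega$. By (3.9) we know that $c_\tau \le a_\tau$ as desired. Thus

$$\max_x \mu(x)/\pi(x) = \frac{1-p}{1-c_\tau} \le \frac{1-p}{1-a_\tau} = \frac{1-p}{1 - (p + (1-p)\rho)} = \frac{1}{1-\rho},$$

and so (1.4) indeed holds.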

Let $x \in \Omega$. By (3.9)-(3.10)

$$\mu(x)/\pi(x) = \frac{\mathrm{H}_\sigma^{t+\tau}(x)/\pi(x)}{1 - c_\tau} \wedge \frac{1-p}{1 - c_\tau}. \tag{3.11}$$

For any $0 \le b \le 1$, denote $E_b := \{x : \mu(x) < (1-b)\pi(x)\}$. Fix some $0 \le b \le 1$. Note that by the second equality in (3.9) and the definition of $D_0$, we have that $1 - c_\tau \le 1 - p$. Thus $1 - b \le 1 \le (1-p)/(1-c_\tau)$. By (3.11) we have that if $x \in E_b$, then $(1 - c_\tau)\mu(x) = \mathrm{H}_\sigma^{t+\tau}(x)$. Thus $x \in E_b$ iff $\mathrm{H}_\sigma^{t+\tau}(x) < (1 - c_\tau)(1-b)\pi(x) \le (1-p)(1-b)\pi(x)$. Denote $r_b := (1-p)b$. Note that $1 - p - r_b = (1-p)(1-b)$. Hence $E_b \subset D_{r_b}$. By (3.6) and (3.8) we have that

$$\pi(E_b) \le \pi(D_{r_b}) \le \frac{1}{1 + r_b^2\,e^{2w + 2\log\left(\frac{2}{p(1-p)}\right)}} = \frac{1}{1 + 4b^2p^{-2}e^{2w}} = \frac{1}{1 + (2b/\rho)^2}. \tag{3.12}$$

This establishes (1.5).

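Consider the function $f : \Omega \to \mathbb{R}$, $f(x) := \Big(1 - \frac{\mu(x)}{\pi(x)}\Big)1_{\mu(x) < \pi(x)}$. Then by (1.4)

$$\|\mu - \pi\|_{2,\pi}^2 \le \mathbb{E}_\pi[f^2] + \frac{\rho^2}{(1-\rho)^2}. \tag{3.13}$$

Note that $0 \le f \le 1$ and that $\{f > r\} \subset E_r$. Hence by (3.12),

$$\pi(\{f > r\}) \le \frac{1}{1 + (2r/\rho)^2}.$$

Thus

$$\mathbb{E}_\pi f^2 = \int_0^1 2r\,\pi\{f > r\}\,dr \le \int_0^1 \frac{2r\,dr}{1 + (2r/\rho)^2} = (\rho/2)^2\int_0^1 \frac{d}{dr}\log\big(1 + 4r^2\rho^{-2}\big)\,dr = (\rho/2)^2\log\big[1 + (2/\rho)^2\big].$$

Plugging the last estimate into (3.13) yields (1.6).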

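Finally, we are only left to prove (1.7). Denote $\beta := 1 - \rho e^{w/2}/2$ and

$$F := \{x \in \Omega : \mu(x) < \beta\pi(x)\} = E_{1-\beta}.$$

Consider $\mu_1 \in \mathscr{P}(\Omega)$ defined as follows,

$$\mu_1(x) := \frac{[\mu(x) - \beta\pi(x)]\,1_{x \in F^c}}{\mu(F^c) - \beta\pi(F^c)}.$$

By (1.4) we get that

$$\kappa_\tau := \sum_{y \in F^c}[\mu(y) - \beta\pi(y)] \le \sum_{y \in F^c}\Big[\frac{1}{1-\rho} - \beta\Big]\pi(y) = \Big(\frac{1}{1-\rho} - \beta\Big)\pi(F^c) \le \frac{1}{1-\rho} - \beta = d_\tau.$$

By (1.5) and our choice of $\beta$ we get that

$$\pi(F) = \pi(E_{1-\beta}) \le \frac{1}{1 + 4(1-\beta)^2p^{-2}e^{2w}} = \frac{1}{1 + e^{w}}.$$

Consider $\mu_2(x) := \frac{\beta\pi(x)1_{x \in F^c} + \mu(x)1_{x \in F}}{1 - \kappa_\tau}$. Then by construction, $\mu = \kappa_\tau\mu_1 + (1 - \kappa_\tau)\mu_2$. Any $\mu' \in \mathscr{P}(\Omega)$ can be written as $\mu' = \sum_A q'(A)\pi_A$ for some distribution $q'$ on subsets of $\Omega$, which is supported on $\big\{A \subset \Omega : A \supseteq \{x \in \Omega : \frac{\mu'(x)}{\pi(x)} = \max_{y \in \Omega}\frac{\mu'(y)}{\pi(y)}\}\big\}$. For $\mu' = \mu_2$ we get that $\pi\big(\{x \in \Omega : \frac{\mu_2(x)}{\pi(x)} = \max_{y \in \Omega}\frac{\mu_2(y)}{\pi(y)}\}\big) = \pi(F^c) \ge \frac{e^{w}}{1 + e^{w}}$ as desired. □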
4 Proofs of Theorem 2 and Propositions 1.8 and 1.11

Let $\pi_A$ denote $\pi$ conditioned on $A$ (i.e. $\pi_A(y) := 1_{y \in A}\,\pi(y)/\pi(A)$). The following lemma and corollary are taken from [3] (Lemma 3.5 and Corollary 3.4). The proofs in [3] are given for the discrete-time case, but the necessary adaptations for continuous-time are explained in Section 5 ibid.

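Lemma 4.1. Let $(\Omega, P, \pi)$ be a finite irreducible reversible Markov chain. Let $A \subsetneq \Omega$ be non-empty. Let $\alpha > 0$ and $w \ge 0$. Let $B(A, w, \alpha) := \{y : \mathrm{H}_y[\pi(A)T_A > wt_{\mathrm{rel}}] \ge \alpha\}$. Then

$$\mathrm{H}_\pi[T_A > t] \le \pi(A^c)\exp\Big(-\frac{t\pi(A)}{t_{\mathrm{rel}}}\Big), \quad\text{for any } t \ge 0. \tag{4.1}$$

In particular,

$$\pi(B(A, w, \alpha)) \le \pi(A^c)e^{-w}\alpha^{-1}. \tag{4.2}$$

Proof: For (4.1) see [3]. Write $B = B(A, w, \alpha)$ and $t := wt_{\mathrm{rel}}/\pi(A)$. By (4.1)

$$\alpha\pi(B) \le \pi(B)\mathrm{H}_{\pi_B}[T_A > t] \le \mathrm{H}_\pi[T_A > t] \le \pi(A^c)\exp\Big(-\frac{t\pi(A)}{t_{\mathrm{rel}}}\Big) \le \pi(A^c)e^{-w}. \quad □$$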

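Corollary 4.2. Let $(\Omega, P, \pi)$ be a finite irreducible reversible Markov chain. Let $\mu \in \mathscr{P}(\Omega)$ and $0 < \varepsilon \le 1/2$. Denote $s_\varepsilon := 2t_{\mathrm{rel}}|\log\varepsilon|$. Then

$$\mathrm{hit}_{1/2,\mu}(3\varepsilon/2) \le t_{\mathrm{mix},\mu}(\varepsilon) + s_\varepsilon \quad\text{and}\quad \mathrm{hit}_{1/2,\mu}(1 - \varepsilon/2) \le t_{\mathrm{mix},\mu}(1-\varepsilon) + s_\varepsilon.$$

Proof. Fix some $0 < \varepsilon \le 1/2$. Take an arbitrary set $A$ with $\pi(A) \ge \frac{1}{2}$ and $\mu \in \mathscr{P}(\Omega)$. It follows by coupling the chain with initial distribution $\mathrm{H}_\mu^t$ with the stationary chain that for all $t \ge 0$

$$\mathrm{H}_\mu[T_A > t + s_\varepsilon] \le d_\mu(t) + \mathrm{H}_\pi[T_A > s_\varepsilon] \le d_\mu(t) + \tfrac{1}{2}e^{-s_\varepsilon/(2t_{\mathrm{rel}})} \le d_\mu(t) + \frac{\varepsilon}{2}, \tag{4.3}$$

where the penultimate inequality is a consequence of (4.1) and the choice of $s_\varepsilon$. Putting $t = t_{\mathrm{mix},\mu}(\varepsilon)$ and $t = t_{\mathrm{mix},\mu}(1-\varepsilon)$ successively in (4.3) and maximizing over $A$ such that $\pi(A) \ge \frac{1}{2}$ completes the proof. □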

Proof of Proposition 1.8: The proof follows by combining Corollaries 3.3 and 4.2. 

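Corollary 4.3. Let $(\Omega, P, \pi)$ be a reversible irreducible finite Markov chain. Let $\mu \in \mathscr{P}(\Omega)$. Then for any $0 < \varepsilon < \delta < 1$ and any $0 < \beta < \gamma < 1$,

$$\mathrm{hit}_{\gamma,\mu}(\delta) \le \mathrm{hit}_{\beta,\mu}(\delta) \le \mathrm{hit}_{\gamma,\mu}(\delta - \varepsilon) + \beta^{-1}t_{\mathrm{rel}}\log\Big(\frac{1-\beta}{(1-\gamma)\varepsilon}\Big). \tag{4.4}$$

Proof. The first inequality in (4.4) is trivial. We now prove the second inequality in (4.4). Let $A$ be an arbitrary set with $\pi(A) \ge \beta$. Denote $w := \log\big(\frac{1-\beta}{(1-\gamma)\varepsilon}\big)$ and $s := \beta^{-1}t_{\mathrm{rel}}\,w$. Using the notation from Lemma 4.1, let $D := \Omega \setminus B(A, w, \varepsilon)$. Then by (4.2),

$$\pi(D) \ge 1 - \varepsilon^{-1}\pi(A^c)e^{-\log\left(\frac{1-\beta}{(1-\gamma)\varepsilon}\right)} \ge \gamma.$$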
Set $t_1 := \mathrm{hit}_{\gamma,\mu}(\delta - \varepsilon)$. Then $\mathrm{H}_\mu[T_D > t_1] \le \delta - \varepsilon$. By the definition of $D$,

$$\mathrm{H}_\mu[T_A > t_1 + s \mid T_D \le t_1] \le \max_{d \in D:\ \mathrm{H}_\mu[X_{T_D} = d] > 0}\mathrm{H}_\mu[T_A > t_1 + s \mid T_D \le t_1,\ X_{T_D} = d] \le \max_{d \in D}\mathrm{H}_d[T_A > s] \le \varepsilon.$$

Whence

$$\mathrm{H}_\mu[T_A > t_1 + s] \le \mathrm{H}_\mu[T_D > t_1] + \mathrm{H}_\mu[T_A > t_1 + s \mid T_D \le t_1] \le (\delta - \varepsilon) + \varepsilon = \delta.$$

Since $A$ was arbitrary, this concludes the proof of (4.4). □

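Proposition 4.4. Let $(\Omega_n, P_n, \pi_n)$ be a sequence of finite irreducible reversible chains. Let $\mu_n \in \mathscr{P}(\Omega_n)$. Assume that $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix},\mu_n}^{(n)}(\delta))$, for some $0 < \delta < 1$. Then (1)-(2) below are equivalent:

(1) There exists some $\alpha \in (0,1)$ such that the sequence exhibits a $\mathrm{hit}_{\alpha,\mu_n}$-cutoff.

(2) For any $\alpha \in (0,1)$ the sequence exhibits a $\mathrm{hit}_{\alpha,\mu_n}$-cutoff.

Moreover, for any $\alpha \in (0,1)$,

$$(1 - o(1))\,t_{\mathrm{mix},\mu_n}^{(n)}(\delta) \le \mathrm{hit}_{\alpha,\mu_n}^{(n)}(\delta/2) \le (1 + o(1))\,t_{\mathrm{mix},\mu_n}^{(n)}(\delta/8). \tag{4.5}$$

Furthermore, if (2) holds then

$$\lim_{n \to \infty}\mathrm{hit}_{\alpha,\mu_n}^{(n)}(1/4)\,/\,\mathrm{hit}_{1/2,\mu_n}^{(n)}(1/4) = 1, \quad\text{for any } \alpha \in (0,1). \tag{4.6}$$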

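Proof. First assume that (4.5) indeed holds. Then $t_{\mathrm{rel}}^{(n)} = o(\mathrm{hit}_{\alpha,\mu_n}^{(n)}(\delta/2))$, for any $\alpha \in (0,1)$. This in conjunction with Corollary 4.3 implies (4.6) and the equivalence between (1)-(2) (c.f. the proof of Proposition 3.6 in [3]). We now prove (4.5).

The first inequality in (4.5) follows from (3.3) and the assumption that $t_{\mathrm{rel}}^{(n)} = o(t_{\mathrm{mix},\mu_n}^{(n)}(\delta))$. We now prove the second inequality in (4.5). By (4.4), $\mathrm{hit}_{\alpha,\mu_n}^{(n)}(\delta/2) \le \mathrm{hit}_{1-\delta/8,\mu_n}^{(n)}(\delta/4) + o(t_{\mathrm{mix},\mu_n}^{(n)}(\delta))$. By the first inequality in (3.5), $\mathrm{hit}_{1-\delta/8,\mu_n}^{(n)}(\delta/4) \le t_{\mathrm{mix},\mu_n}^{(n)}(\delta/8)$. Hence

$$\mathrm{hit}_{\alpha,\mu_n}^{(n)}(\delta/2) \le \mathrm{hit}_{1-\delta/8,\mu_n}^{(n)}(\delta/4) + o(t_{\mathrm{mix},\mu_n}^{(n)}(\delta)) \le (1 + o(1))\,t_{\mathrm{mix},\mu_n}^{(n)}(\delta/8).$$

□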

Proof of Theorem 2. We now prove the equivalence between (i)-(iii) in Theorem 2. We 
defer the equivalence between (iii) and (ii) in Theorem 2 to the end of this section. We now 
prove the equivalence between (i) and (ii). 

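By Proposition 4.4 it suffices to show that (i) is equivalent to the condition that

\[ \mathrm{hit}_{1/2,\mu_n}^{(n)}(\epsilon) - \mathrm{hit}_{1/2,\mu_n}^{(n)}(1-\epsilon) = o\big(\mathrm{hit}_{1/2,\mu_n}^{(n)}(1/4)\big), \quad \text{for any } 0 < \epsilon < 1/4. \tag{4.7} \]

By Proposition 1.8,

\[ \mathrm{hit}_{1/2,\mu_n}^{(n)}(3\epsilon/2) - 2 t_{\mathrm{rel}}^{(n)} |\log \epsilon| \le t_{\mathrm{mix},\mu_n}^{(n)}(\epsilon) \le \mathrm{hit}_{1/2,\mu_n}^{(n)}(3\epsilon/4) + t_{\mathrm{rel}}^{(n)} \log(16/\epsilon), \]

\[ \mathrm{hit}_{1/2,\mu_n}^{(n)}(1-\epsilon/2) - 2 t_{\mathrm{rel}}^{(n)} |\log \epsilon| \le t_{\mathrm{mix},\mu_n}^{(n)}(1-\epsilon) \le \mathrm{hit}_{1/2,\mu_n}^{(n)}(1-2\epsilon) + t_{\mathrm{rel}}^{(n)} \log 4. \]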


Using this together with Proposition 4.4, it is easy to verify that (i) is indeed equivalent to 
(4.7). □ 

We now present two lemmas giving inequalities for expected hitting times of sets. Proposition 1.11 follows by combining these two lemmas. The first is simpler and gives better bounds for some purposes. The second gives better asymptotics, in the sense that it implies (in the reversible setup) that $t_{\mathrm{H}}(\epsilon) - t_{\mathrm{H}}(1-\epsilon) \le c\, t_{\mathrm{rel}}\, \epsilon^{-1}$ for some absolute constant $c$, whereas the first lemma only implies that $t_{\mathrm{H}}(\epsilon) - t_{\mathrm{H}}(1-\epsilon) \le c\, t_{\mathrm{rel}}\, \epsilon^{-1} \log(1/\epsilon)$.

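Lemma 4.5. Let $(\Omega,P,\pi)$ be a finite reversible irreducible Markov chain. Let $\epsilon \in (0,1)$ and $k \ge 0$. Let $A \subset \Omega$ be such that $\pi(A) \ge 1 - \epsilon$. Then there exists $I = I(A,k) \subset \Omega$ with $\pi(I) \ge 1 - \frac{\epsilon}{2 \cdot 3^k}$, such that

\[ \mathbb{E}_z[T_A] \le (3+k)(1-\epsilon)^{-1} t_{\mathrm{rel}} \log 3, \quad \text{for any } z \in I. \]

Proof. Fix $k \ge 0$. Let $i \in \mathbb{N}$. Denote $a = a_\epsilon := (1-\epsilon)^{-1} t_{\mathrm{rel}} \log 3$. Consider

\[ I_i = I_i(A,k) := \{ y : \mathrm{H}_y[T_A > (2i+k)a] < 3^{-i} \}. \]

Then by (4.2) we have that

\[ \pi(I_i) \ge 1 - (3^{-i})^{-1}(1 - \pi(A))\, e^{-(k+2i)(1-\epsilon)^{-1}\pi(A)\log 3} \ge 1 - \epsilon (1/3)^{i+k}. \]

Let $I := \bigcap_{i \in \mathbb{N}} I_i$. By a union bound over the complements we have that $\pi(I) \ge 1 - \epsilon \sum_{i \in \mathbb{N}} (1/3)^{i+k} = 1 - \frac{\epsilon}{2 \cdot 3^k}$. Moreover, by construction,

\[ \mathrm{H}_z[T_A - ka > 2ia] < 3^{-i}, \quad \text{for all } z \in I \text{ and } i \in \mathbb{N}. \]

Hence $\mathbb{E}_z[T_A] \le 2a \sum_{i \ge 0} 3^{-i} + ka = (3+k)(1-\epsilon)^{-1} t_{\mathrm{rel}} \log 3$, for all $z \in I$. □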
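The two geometric sums used in the proof of Lemma 4.5 can be checked with exact rational arithmetic. The following is only a sanity-check sketch: the function names and the truncation level `terms` are illustrative choices, not part of the paper, and the sums are truncated, so the values approach the limits $1/(2 \cdot 3^k)$ and $3$ from below.

```python
from fractions import Fraction


def union_bound_mass(k, terms=200):
    """Truncated sum of (1/3)^(i+k) over i >= 1: the factor multiplying
    epsilon in the union bound over the complements of the sets I_i."""
    return sum(Fraction(1, 3) ** (i + k) for i in range(1, terms + 1))


def tail_sum_constant(terms=200):
    """Truncated value of 2 * sum_{i >= 0} 3^(-i): the factor multiplying a
    when E[(T_A - k a)^+] is bounded through windows of length 2a."""
    return 2 * sum(Fraction(1, 3) ** i for i in range(terms))


if __name__ == "__main__":
    for k in range(4):
        print(k, float(union_bound_mass(k)), float(Fraction(1, 2 * 3 ** k)))
    print(float(tail_sum_constant()))
```

The second function encodes the tail-sum bound $\mathbb{E}[Y^+] \le c \sum_{i \ge 0} \mathrm{H}[Y > ic]$ with $c = 2a$ and $Y = T_A - ka$, which is one way to arrive at the constant $3 + k$.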

Lemma 4.6. Let $(\Omega,P,\pi)$ be a finite irreducible reversible chain. Let $A \subset \Omega$ be non-empty. Denote $\epsilon := \pi(A)$. Let $t \ge 1$. Denote $r = r(\epsilon,t) := (t + |\log \epsilon|)\, t_{\mathrm{rel}}$, $\ell := t_{\mathrm{rel}} \log 2$ and $s = s(\epsilon) := 2\epsilon^{-1} t_{\mathrm{rel}} \log 2$. There exists a set $J = J(A,t) \subset \Omega$ such that

(i) $\pi(J) \ge 1 - 3e^{-2t}/4$.

(ii) For any $z \in J$ we have that $\mathbb{E}_z[T_A] \le r + 11(s+\ell)/2$.

(iii) For any $z \in J$ and $i \in \mathbb{N}$ we have that $\mathrm{H}_z[T_A \le r + i(\ell+s)] \ge (1 - 2^{-i})^2$.

Proof. For any $i \in \mathbb{N}$, let $C_i := \{ y : \mathrm{H}_y[T_A \le is] \ge 1 - 2^{-i} \}$. Then by (4.2) we have that

\[ \pi(C_i) \ge 1 - (1 - (1 - 2^{-i}))^{-1}(1 - \pi(A))\, e^{-2i \log 2} = 1 - (1-\epsilon) 2^{-i}. \tag{4.8} \]

Define

\[ J_i := \{ y : \mathrm{H}^{r+i\ell}_y[C_i] \ge 1 - 2^{-i} = 1 - (1-\epsilon)2^{-i} - \epsilon 2^{-i} \}. \]

Denote its complement by $B_i$. Denote $g_i := H_{r+i\ell} 1_{C_i}$. By stationarity $\mathbb{E}_\pi[g_i] = \mathbb{E}_\pi[1_{C_i}] = \pi(C_i)$. Hence by (4.8)

\[ B_i \subset \{ x : |g_i(x) - \mathbb{E}_\pi[g_i]|^2 \ge \epsilon^2 2^{-2i} \}. \tag{4.9} \]

By (2.1),

\[ \mathrm{Var}_\pi g_i \le e^{-2(r+i\ell)/t_{\mathrm{rel}}}\, \mathrm{Var}_\pi 1_{C_i} \le \epsilon^2 2^{-2i} e^{-2t} \pi(C_i)(1 - \pi(C_i)) \le \epsilon^2 2^{-3i}(1 - 2^{-i}) e^{-2t}. \]

By Chebyshev's inequality and (4.9)

\[ \pi(B_i) \le \epsilon^{-2} 2^{2i}\, \mathrm{Var}_\pi g_i \le 2^{-i}(1 - 2^{-i}) e^{-2t}. \]

Denote $J := \bigcap_{i \ge 1} J_i$. Then by a union bound

\[ \pi(J) \ge 1 - \sum_{i \in \mathbb{N}} \pi(B_i) \ge 1 - 3e^{-2t}/4. \]

Fix some $z \in J$ and $i \ge 1$. Note that because $J \subset J_i$ we get from the definitions of $J_i$ and $C_i$ together with the Markov property that

\[ \mathrm{H}_z[T_A \le r + i\ell + is] \ge \mathrm{H}_z[X_{r+i\ell} \in C_i] \min_{y \in C_i} \mathrm{H}_y[T_A \le is] \ge (1 - 2^{-i})^2. \]

For $i = 1$ the RHS equals $1/4$ and for $i \ge 2$ the RHS is at least $1 - 2^{-i+1}$. From this it is easy to verify that indeed $\mathbb{E}_z[T_A] \le r + 11(s+\ell)/2$. □
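The numerical constants in parts (i) and (ii) of Lemma 4.6 can be sanity-checked with exact rationals. This is a sketch under stated assumptions: the function names and the truncation level are illustrative, and the second function uses the tail-sum bound $\mathbb{E}[Y^+] \le c \sum_{i \ge 0} \mathrm{H}[Y > ic]$ with $c = \ell + s$ and $Y = T_A - r$, which is one possible route to part (ii) and not necessarily the one intended in the text.

```python
from fractions import Fraction


def complement_mass(terms=200):
    """Truncated sum of 2^-i (1 - 2^-i) over i >= 1, which bounds
    sum_i pi(B_i) up to the common factor e^{-2t}."""
    return sum(Fraction(1, 2 ** i) * (1 - Fraction(1, 2 ** i))
               for i in range(1, terms + 1))


def window_count_bound(terms=200):
    """Truncated value of 1 + sum_{i >= 1} (1 - (1 - 2^-i)^2): an upper bound
    on the expected number of windows of length (l + s) needed after time r,
    using part (iii) for the tail probabilities."""
    return 1 + sum(1 - (1 - Fraction(1, 2 ** i)) ** 2
                   for i in range(1, terms + 1))


if __name__ == "__main__":
    print(float(complement_mass()), "vs the stated constant", 3 / 4)
    print(float(window_count_bound()), "vs the stated constant", 11 / 2)
```

Both truncated sums stay below the constants $3/4$ and $11/2$ appearing in the lemma, so under these assumptions the stated bounds hold with some room to spare.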

The following lemma asserts that, for a fixed starting distribution $\mu$ such that $t_{\mathrm{rel}}$ is much smaller than $t_{\mathrm{mix},\mu}$, a set $A$ which is "worst" in expectation (i.e. $\mathbb{E}_\mu[T_A] = t_{\mathrm{H},\mu}(1-\epsilon)$ and $\pi(A) \ge 1 - \epsilon$) is almost the "worst in probability" (in the sense of Definition 1.1) for all times. By this we mean that this is the case up to small shifts in size and time and up to a small difference in the chance of not being hit by a given time.

Lemma 4.7. Let $(\Omega,P,\pi)$ be a finite irreducible reversible chain. Let $\mu \in \mathscr{P}(\Omega)$ and $0 < \epsilon < 1$. Let $A \subset \Omega$ be such that $\mathbb{E}_\mu[T_A] = t_{\mathrm{H},\mu}(1-\epsilon)$ and $\pi(A) \ge 1 - \epsilon$. Denote $\rho = \rho(\epsilon) := 3(1-\epsilon)^{-1} t_{\mathrm{rel}} \log 3$. Then

\[ \max_{B \subset \Omega :\, \pi(B) \ge 1 - \epsilon/2} \mathrm{H}_\mu[T_B - T_A > r\rho] \le r^{-1}, \quad \text{for any } r \ge 1. \tag{4.10} \]

In particular, for any $t \ge 0$, $r \ge 1$ and $q \in (0,1)$ we have that

\[ \begin{aligned} &\mathrm{H}_\mu[T_A \le t] - \mathrm{H}_\mu[T_B \le t + r\rho - 1] \le \mathrm{H}_\mu[T_B > t + r\rho,\ T_A \le t] \le r^{-1}, \text{ and} \\ &\mathrm{H}_\mu[T_A \le \mathrm{hit}_{1-\epsilon/2,\mu}(1-q) - r\rho] \le q + r^{-1}. \end{aligned} \tag{4.11} \]

Proof. We first note that the first row in (4.11) follows from (4.10) trivially. The second row in (4.11) follows from the first by taking $t = \mathrm{hit}_{1-\epsilon/2,\mu}(1-q) - r\rho$ and picking some $B \subset \Omega$ such that $\pi(B) \ge 1 - \epsilon/2$ and $\mathrm{H}_\mu[T_B \le \mathrm{hit}_{1-\epsilon/2,\mu}(1-q) - 1] \le q$.

We now prove (4.10). Let $I$ be as in Lemma 4.5 w.r.t. $A$ with the choice of $k = 0$. Then $\pi(I) \ge 1 - \epsilon/2$ and for any $z \in I$ we have $\mathbb{E}_z[T_A] \le \rho$. Denote $D := I \cap B$. Then by the assumption that $\pi(B) \ge 1 - \epsilon/2$, we have that $\pi(D) \ge 1 - \pi(B^c) - \pi(I^c) \ge 1 - \epsilon$. Hence

\[ \mathbb{E}_\mu[T_D] \le t_{\mathrm{H},\mu}(1-\epsilon) = \mathbb{E}_\mu[T_A]. \tag{4.12} \]

For any $\ell \in \mathbb{R}$ denote $\ell^+ := \max\{\ell, 0\}$. Since $D \subset I$, by the Markov property we have that $\mathbb{E}_\mu[(T_A - T_D)^+] \le \max_{z \in I} \mathbb{E}_z[T_A] \le \rho$. Thus by (4.12)

\[ \mathbb{E}_\mu[(T_D - T_A)^+] = \mathbb{E}_\mu[T_D - T_A] + \mathbb{E}_\mu[(T_A - T_D)^+] \le 0 + \rho = \rho. \]

By Markov's inequality and the fact that $D \subset B$ we get that

\[ \mathrm{H}_\mu[T_B - T_A > r\rho] \le \mathrm{H}_\mu[T_D - T_A > r\rho] \le r^{-1}, \quad \text{for all } r \ge 1. \qquad □ \]


We are now ready to conclude the proof of Theorem 2 by establishing the equivalence 
between (ii) and (iii) in Theorem 2. 

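Proof. We start by showing that (ii)$\Rightarrow$(iii). Let $\alpha \in (0,1)$. Assume that

\[ \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(\epsilon) - \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(1-\epsilon) = o\big(\mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(1/4)\big), \quad \text{for any } 0 < \epsilon < 1/4. \]

Then by Proposition 4.4,

\[ \mathrm{hit}^{(n)}_{\beta,\mu_n}(\epsilon) - \mathrm{hit}^{(n)}_{\beta,\mu_n}(1-\epsilon) = o\big(\mathrm{hit}^{(n)}_{\beta,\mu_n}(1/4)\big), \quad \text{for any } 0 < \epsilon < 1/4 \text{ and } 0 < \beta < 1. \]

Let $A_n \subset \Omega_n$ be an arbitrary sequence of sets such that $\mathbb{E}_{\mu_n}[T_{A_n}] = t^{(n)}_{\mathrm{H},\mu_n}(1-\alpha)$ and $\pi_n(A_n) \ge 1 - \alpha$. By the equivalence between (i) and (ii) in Theorem 2, we have that

\[ t^{(n)}_{\mathrm{mix},\mu_n}(\epsilon) - t^{(n)}_{\mathrm{mix},\mu_n}(1-\epsilon) = o\big(t^{(n)}_{\mathrm{mix},\mu_n}\big), \quad \text{for any } 0 < \epsilon < 1/4. \]

Fix some $0 < \epsilon < 1/8$. Using Proposition 4.4 and similar reasoning as in the proof of the equivalence between (i) and (ii) in Theorem 2, we get that

\[ (1 - o(1))\, t^{(n)}_{\mathrm{mix},\mu_n} \le \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(1-\epsilon) \le \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(\epsilon) \le (1 + o(1))\, t^{(n)}_{\mathrm{mix},\mu_n}, \tag{4.13} \]

and also

\[ (1 - o(1))\, t^{(n)}_{\mathrm{mix},\mu_n} \le \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(1-\epsilon) \le \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\epsilon) \le (1 + o(1))\, t^{(n)}_{\mathrm{mix},\mu_n}. \tag{4.14} \]

Let $k_n(p) := \inf\{ t : \mathrm{H}_{\mu_n}[T_{A_n} > t] \le p \}$. Then by the definition of $\mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(\epsilon)$ and the fact that $\pi_n(A_n) \ge 1 - \alpha$ (first inequality), together with (4.13) we get that

\[ k_n(\epsilon) \le \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(\epsilon) \le (1 + o(1))\, t^{(n)}_{\mathrm{mix},\mu_n}. \tag{4.15} \]

Conversely, let $\ell_n = \ell_n(\epsilon) := 3\epsilon^{-1}(1-\alpha)^{-1} t^{(n)}_{\mathrm{rel}} \log 3 = o(t^{(n)}_{\mathrm{mix},\mu_n})$. Then by (4.11) (first inequality) and (4.14) we get that

\[ k_n(1 - 2\epsilon) \ge \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(1-\epsilon) - \ell_n \ge (1 - o(1))\, t^{(n)}_{\mathrm{mix},\mu_n}. \tag{4.16} \]

Whence (4.15) implies that $k_n(\epsilon) - k_n(1-2\epsilon) = o(t^{(n)}_{\mathrm{mix},\mu_n})$. By (4.16) we get that $t^{(n)}_{\mathrm{mix},\mu_n} = O(\mathbb{E}_{\mu_n}[T_{A_n}])$ and thus we also have that $k_n(\epsilon) - k_n(1-2\epsilon) = o(\mathbb{E}_{\mu_n}[T_{A_n}])$, for any $0 < \epsilon < 1/8$. This concludes the proof of (ii)$\Rightarrow$(iii). We now show that (iii)$\Rightarrow$(ii).

Let $\alpha \in (0,1)$ and $A_n \subset \Omega_n$ be an arbitrary sequence of sets such that $\mathbb{E}_{\mu_n}[T_{A_n}] = t^{(n)}_{\mathrm{H},\mu_n}(1-\alpha)$ and $\pi_n(A_n) \ge 1 - \alpha$. Assume that

\[ \lim_{n \to \infty} \mathrm{H}_{\mu_n}\big[\,|T_{A_n} - \mathbb{E}_{\mu_n}[T_{A_n}]| < \epsilon\, \mathbb{E}_{\mu_n}[T_{A_n}]\,\big] = 1, \quad \text{for any } \epsilon > 0. \tag{4.17} \]

As before, denote $k_n(p) := \inf\{ t : \mathrm{H}_{\mu_n}[T_{A_n} > t] \le p \}$. Then by (4.17),

\[ k_n(\epsilon) - k_n(1 - \epsilon/2) = o(k_n(1-\epsilon)), \quad \text{for any } 0 < \epsilon < 1/4. \tag{4.18} \]

Recall that by assumption, there exists some $0 < \delta < 1$ such that $t^{(n)}_{\mathrm{rel}} = o(t^{(n)}_{\mathrm{mix},\mu_n}(\delta))$. Fix some $0 < \epsilon < \delta/4$. By (4.4) we have that

\[ \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(1 - \epsilon/2) \le \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(1-\epsilon) + O(t^{(n)}_{\mathrm{rel}}). \tag{4.19} \]

As in (4.15),

\[ k_n(1 - \epsilon/2) \le \mathrm{hit}^{(n)}_{1-\alpha,\mu_n}(1 - \epsilon/2). \tag{4.20} \]

By (4.5), we have that

\[ t^{(n)}_{\mathrm{mix},\mu_n}(\delta) \le (1 + o(1))\, \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2). \]

Hence by (4.19)-(4.20) we get that

\[ k_n(1 - \epsilon/2) \le \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(1-\epsilon) + o\big(\mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2)\big). \tag{4.21} \]

This, in conjunction with (4.18), yields that for any $0 < \epsilon < \delta/4$,

\[ k_n(\epsilon) - k_n(1 - \epsilon/2) = o\big(\mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2)\big). \tag{4.22} \]

Conversely, let $\ell_n(\epsilon)$ be as before. Fix some $0 < \epsilon < \delta/4$. Then $\ell_n(\epsilon/2) = o(t^{(n)}_{\mathrm{mix},\mu_n}(\delta))$ and by (4.5) $\ell_n(\epsilon/2) = o(\mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2))$. Similarly to the derivation of (4.16), by (4.11)

\[ k_n(\epsilon) \ge \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\epsilon/2) - \ell_n(\epsilon/2) = \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\epsilon/2) - o\big(\mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2)\big). \tag{4.23} \]

This, in conjunction with (4.21)-(4.22), implies that

\[ \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\epsilon/2) - \mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(1-\epsilon) = o\big(\mathrm{hit}^{(n)}_{1-\alpha/2,\mu_n}(\delta/2)\big), \quad \text{for any } 0 < \epsilon < \delta/4. \qquad □ \]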
5 Aldous' Example


We now present and analyze a version of Aldous' example (see Figure 1) for a sequence of reversible Markov chains $(\Omega_n, P_n, \pi_n)$ which satisfies the product condition but does not exhibit cutoff. Our version of Aldous' example demonstrates the behavior described in Proposition 1.6. Namely, we show that the sequence does not exhibit cutoff although there exist $A_n \subset \Omega_n$ with $\pi_n(A_n) \ge 1/2$ and $x_n \in \Omega_n$ satisfying $t^{(n)}_{\mathrm{H}}(1/2) = \mathbb{E}_{x_n}[T_{A_n}]$ such that the hitting times of $A_n$ started from the initial positions $x_n \in \Omega_n$ are concentrated.


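[Plot omitted: total variation distance as a function of time $t$, with the level $1/2$ marked on the vertical axis and the times $300n$ and $306n$ marked on the horizontal axis.]

Figure 1: Decay in total variation distance for Aldous' example: it does not have cutoff.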


Example 5.1. Consider the sequence of chains $(\Omega_n, P_n, \pi_n)$, where $\Omega_n := A \cup B \cup C \cup \{z\}$, where $A = A_n := \{a_{2n+1}, a_{2n}, a_{2n-1}, \ldots, a_{n+1}\}$, $B = B_n := \{b_n, b_{n-1}, \ldots, b_1\}$ and $C = C_n := \{c_n, c_{n-1}, \ldots, c_1\}$. For notational convenience we write $a := a_{2n+1}$, $v := a_{n+1} = b_{n+1} = c_{n+1}$ and $b_0 = z = c_0$.

Define the transition matrix $P_n$ by

■ Holding probabilities:

\[ P_n(x,x) = \begin{cases} 1/2 & x \in A_n \cup B_n, \\ 99/100 & x \in C_n \cup \{z\}. \end{cases} \]

■ Values at the special three states $a = a_{2n+1}$, $v = a_{n+1}$, $z = b_0 = c_0$: $P_n(a, a_{2n}) = 1/2$,

\[ P_n(v, a_{n+2}) = P_n(v, b_n) = P_n(v, c_n) = \frac{1}{6}, \qquad P_n(z, b_1) = \frac{1}{200}, \qquad P_n(z, c_1) = \frac{1}{200}. \]

■ $P_n(a_{n+k}, a_{n+k-1}) = P_n(b_k, b_{k-1}) = 1/3 = 2P_n(a_{n+k}, a_{n+k+1}) = 2P_n(b_k, b_{k+1})$ for all $k \in [n]$.

■ $P_n(c_k, c_{k-1}) = \frac{1}{150} = 2P_n(c_k, c_{k+1})$ for all $k \in [n]$.

Figure 2: We consider a Markov chain with the transition probabilities specified above.

Think of $\Omega_n$ as a nearest neighbor walk on an interval of length $2n+1$, biased towards state $z$, which in the middle of the interval (at state $v$) splits into two parallel paths, $B$ and $C$ (which we refer to as "branches") of length $n$ leading to $z$ (see Figure 2). The difference between the two branches is that on branch $C$ the holding probability is much larger (i.e. $P_n(c,c) = 99/100$ for all $c \in C$, while $P_n(b,b) = 1/2$ for all $b \in B$).
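The definition of $P_n$ can be encoded directly and checked mechanically. The sketch below is illustrative only: the state encoding (`'z'`, `('a', j)`, `('b', k)`, `('c', k)`) and the function names are assumptions made here, and the rule for the states $a_{n+k}$ is applied for $2 \le k \le n$ because $v = a_{n+1}$ has its own special values. It verifies that every row sums to one and that detailed balance holds on every edge, including the edge closing the cycle through $B$, $z$ and $C$.

```python
from fractions import Fraction as F


def build_chain(n):
    """Transition probabilities of Example 5.1 as {state: {state: prob}}.

    Hypothetical encoding: 'z', ('b', k) and ('c', k) for k = 1..n,
    ('a', j) for j = n+1..2n+1, with v = ('a', n+1).
    """
    v, top = ('a', n + 1), ('a', 2 * n + 1)

    def node(kind, k):
        # Endpoint conventions b_0 = c_0 = z and b_{n+1} = c_{n+1} = v.
        return 'z' if k == 0 else v if k == n + 1 else (kind, k)

    P = {'z': {'z': F(99, 100), ('b', 1): F(1, 200), ('c', 1): F(1, 200)},
         v: {v: F(1, 2), ('a', n + 2): F(1, 6),
             ('b', n): F(1, 6), ('c', n): F(1, 6)},
         top: {top: F(1, 2), ('a', 2 * n): F(1, 2)}}
    for k in range(1, n + 1):
        P[('b', k)] = {('b', k): F(1, 2),
                       node('b', k - 1): F(1, 3), node('b', k + 1): F(1, 6)}
        P[('c', k)] = {('c', k): F(99, 100),
                       node('c', k - 1): F(1, 150), node('c', k + 1): F(1, 300)}
    for j in range(n + 2, 2 * n + 1):
        P[('a', j)] = {('a', j): F(1, 2),
                       ('a', j - 1): F(1, 3), ('a', j + 1): F(1, 6)}
    return P


def stationary_weights(P, root='z'):
    """Unnormalised measure obtained by propagating detailed balance from root."""
    weights, stack = {root: F(1)}, [root]
    while stack:
        x = stack.pop()
        for y, p in P[x].items():
            if y not in weights:
                weights[y] = weights[x] * p / P[y][x]
                stack.append(y)
    return weights


def is_reversible(P):
    """Check w(x) P(x,y) == w(y) P(y,x) on every edge, exactly."""
    w = stationary_weights(P)
    return all(w[x] * p == w[y] * P[y][x] for x in P for y, p in P[x].items())


if __name__ == "__main__":
    chain = build_chain(5)
    print(all(sum(row.values()) == 1 for row in chain.values()), is_reversible(chain))
```

Exact fractions are used so that the detailed balance check is an equality test, not a floating-point comparison.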

Conditionally on not making a lazy step, the chain moves with a fixed bias towards $z$. More precisely, let $(\Omega_n, Q_n, \pi'_n)$ be the non-lazy version of $(\Omega_n, P_n, \pi_n)$. That is, $Q_n(x,x) = 0$ for all $x$ and $Q_n(x,y) = \frac{P_n(x,y)}{1 - P_n(x,x)}$ for all $x \ne y$. Let $f : \Omega_n \to \{0, 1, \ldots, 2n+1\}$ be $f(b_i) = i = f(c_i)$ and $f(a_{n+1+i}) = n + 1 + i$ for all $0 \le i \le n$. Let $(Y_t)$ be a realization of $(\Omega_n, Q_n, \pi'_n)$. It is easy to see that the projection $Z_t = f(Y_t)$ is a nearest neighbor biased random walk on the interval $\{0, 1, \ldots, 2n+1\}$ (with reflecting boundary conditions) with a fixed bias of $2/3$ of making a step towards $0$. In particular, $T_0$ under $\mathrm{H}_k$ (w.r.t. the chain $(Z_t)$) is concentrated around $\mathbb{E}_k[T_0] = 3k \pm O(1)$ within a time window of size $O(\sqrt{n})$ (where the equality $\mathbb{E}_k[T_0] = 3k \pm O(1)$ follows from (5.11) below).
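The estimate $\mathbb{E}_k[T_0] = 3k \pm O(1)$ for the projected walk can be reproduced by first-step analysis. The sketch below makes the following assumptions, which are modelling choices made here: time is counted in steps of the non-lazy walk, interior states step up with probability $1/3$ and down with probability $2/3$, and the top state $2n+1$ steps down with probability one. The function names are illustrative.

```python
from fractions import Fraction as F


def one_step_down_times(n):
    """Return [E_{k+1}[T_k] for k = 0..2n] for the projected walk on {0,...,2n+1}."""
    top = 2 * n + 1
    e = [None] * top
    e[top - 1] = F(1)  # from the reflecting top state the walk steps down at once
    for k in range(top - 2, -1, -1):
        # First-step analysis at k+1: e_k = 1 + (1/3) * (e_{k+1} + e_k),
        # which rearranges to e_k = 3/2 + e_{k+1} / 2.
        e[k] = F(3, 2) + e[k + 1] / 2
    return e


def expected_time_to_zero(n, k):
    """E_k[T_0] as the sum of the one-level descent times below level k."""
    return sum(one_step_down_times(n)[:k], F(0))


if __name__ == "__main__":
    n = 5
    for k in (1, 5, 11):
        print(k, float(expected_time_to_zero(n, k)), 3 * k)
```

Under these assumptions each one-level descent takes just under $3$ steps in expectation, with a deficit that decays geometrically in the distance from the top of the interval, so the total deficit from $3k$ stays bounded.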

It is easy to check that the chains $(\Omega_n, P_n, \pi_n)$ are indeed reversible. One way to see this is to note that Kolmogorov's cycle condition holds. Alternatively, the corresponding (symmetric) edge weights are $w_n(a_{n+m}, a_{n+m+1}) = 2^{-(n+m)}$, $w_n(b_m, b_{m+1}) = 2^{-m} = w_n(c_m, c_{m+1})$ and

\[ w_n(x,x) = \begin{cases} \sum_{y : y \ne x} w_n(x,y) & x \in A \cup B, \\ 99 \sum_{y : y \ne x} w_n(x,y) & \text{otherwise}. \end{cases} \]

By the well-known discrete analog of Cheeger's inequality (e.g. [6] Theorem 13.14), $t^{(n)}_{\mathrm{rel}} = O(1)$, as the bottleneck-ratio is bounded from below (which can readily be seen from the above edge weights). In particular, the product condition holds.

For any $0 < \epsilon < 1$, let $k_n(\epsilon) := \inf\{ t : \max_{x \in \Omega_n} \mathrm{P}_x[T_z > t] \le \epsilon \}$. As $\pi_n(z) > 1/2$, we get that for any $0 < \epsilon < 1$, $\mathrm{hit}^{(n)}_{1/2}(\epsilon) = k_n(\epsilon)$. We define CB (a shorthand for "chosen branch") to equal $B$ (resp. $C$) if the first visit to $z$ was made by crossing the edge $(b_1, z)$ (resp. $(c_1, z)$).

Note that for any $x \in A$ we have that $\mathrm{H}_x[\mathrm{CB} = B] = 1/2 = \mathrm{H}_x[\mathrm{CB} = C]$. Let $S \in \{B, C\}$. It is easy to see that for every $\ell \in [n]$, conditioned on $\mathrm{CB} = S$, the conditional distribution of $T_z$ under $\mathrm{H}_{a_{n+\ell}}[\,\cdot \mid \mathrm{CB} = S]$ is concentrated around $6\ell + 6n 1_{S=B} + 300n 1_{S=C}$.

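Using the aforementioned projection $(Z_t)$ together with elementary results about hitting probabilities for a nearest neighbor biased walk on an interval (see e.g. [6] Example 9.9) we get that

\[ \mathrm{H}_{b_\ell}[T_v < T_z] = \frac{2^\ell - 1}{2^{n+1} - 1} = \mathrm{H}_{c_\ell}[T_v < T_z], \quad \text{for all } \ell \in [n]. \tag{5.1} \]

Consequently,

\[ \mathrm{H}_{c_\ell}[\mathrm{CB} = B] = \frac{2^\ell - 1}{2(2^{n+1} - 1)} = \mathrm{H}_{b_\ell}[\mathrm{CB} = C], \quad \text{for all } \ell \in [n]. \tag{5.2} \]

In particular, we get that for all $\ell \ge \lceil \log_2 n \rceil$ the law of $T_z$ under $\mathrm{H}_{c_{n+1-\ell}}$ (resp. $\mathrm{H}_{b_{n+1-\ell}}$) is concentrated around $300(n - \ell)$ (resp. $6(n - \ell)$), within a time window of size $O(\sqrt{n})$.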
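The hitting probability in (5.1) is the classical gambler's ruin formula for a walk that steps up with probability $1/3$ and down with probability $2/3$. The following sketch recomputes it from harmonicity alone; the function name is illustrative, and the walk is the non-lazy projection, which has the same hitting probabilities as the lazy chain on either branch.

```python
from fractions import Fraction as F


def prob_hit_top_first(n):
    """h[l] = probability that the walk started at l reaches n+1 before 0."""
    # Harmonicity h(l) = (1/3) h(l+1) + (2/3) h(l-1) means the increments
    # double: h(l+1) - h(l) = 2 * (h(l) - h(l-1)).
    h = [F(0), F(1)]  # unnormalised, with h(0) = 0
    for _ in range(n):
        h.append(h[-1] + 2 * (h[-1] - h[-2]))
    return [x / h[-1] for x in h]  # normalise so that h(n+1) = 1


if __name__ == "__main__":
    n = 6
    h = prob_hit_top_first(n)
    for level in (1, 3, 6):
        print(level, h[level], F(2 ** level - 1, 2 ** (n + 1) - 1))
```

Since the chain started from $v$ chooses each branch with probability $1/2$, halving these values gives the probabilities of switching branches that appear in (5.2).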

Let $S \in \{B, C\}$ and $\ell < \lceil \log_2 n \rceil$. It follows from (5.10) below that $\mathbb{E}_{c_{n+1-\ell}}[T_v \mid \mathrm{CB} = B] \le 300\ell + O(1)$. Using Markov's inequality, and the analysis of the case $x \in A$ with $x = a_{n+1} = v$, it is easy to verify that conditioned on $\mathrm{CB} = S$, the conditional distribution of $T_z$ under $\mathrm{H}_{c_{n+1-\ell}}[\,\cdot \mid \mathrm{CB} = S]$ is concentrated around $6n 1_{S=B} + 300n 1_{S=C}$. The same holds for $b_{n+1-\ell}$.

By the analysis above,

\[ k_n(1/2 - o(1)) \ge 306n - o(n) \quad \text{and} \quad k_n(1/2 + o(1)) \le 300n + o(n). \tag{5.3} \]

In particular, there is no $\mathrm{hit}_{1/2}$-cutoff. By Theorem 1, the sequence does not exhibit a cutoff.

As $\pi_n(z) > 1/2$, for any initial state $x \in \Omega_n$, the "worst" set in expectation of $\pi_n$ measure at least $1/2$ must be $\{z\}$ (i.e. $t^{(n)}_{\mathrm{H},x}(1/2) = \mathbb{E}_x[T_z]$, for any $x \in \Omega_n$).

Let $x_n \in \Omega_n$ be such that $t^{(n)}_{\mathrm{H}}(1/2) = \max_{y \in \Omega_n} \mathbb{E}_y[T_z] = \mathbb{E}_{x_n}[T_z]$. We now argue that $x_n = c_{j_n}$ for some $j_n \in [n]$ such that $\min(n - j_n, j_n) \to \infty$ (in fact, we shall show that $n - j_n = \Theta(\log n)$). Note that starting from such $x_n$, the hitting time of $z$ is concentrated, although the sequence of chains does not exhibit a cutoff.

Most readers should be satisfied by the following explanation. It is clear that either $x_n \in C$ or $x_n = a$. If $\ell_n = o(n)$ and $\ell_n \to \infty$, then the distribution of $T_z$ under $\mathrm{H}_{c_{n-\ell_n}}$ is concentrated around $300n - o(n)$ and $\mathbb{E}_{c_{n-\ell_n}}[T_z] = 300n - o(n)$. On the other hand, $\mathbb{E}_a[T_z] \le 159n$. Lastly, if $\ell_n = O(1)$, then $\mathrm{H}_{c_{n+1-\ell_n}}[\mathrm{CB} = B]$ is bounded from below, and so $\limsup \mathbb{E}_{c_{n+1-\ell_n}}[T_z]/n < 300$.
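The location of the state maximizing $\mathbb{E}_x[T_z]$ can be explored numerically for small $n$ by solving the linear system $h(x) = 1 + \sum_y P_n(x,y) h(y)$ for $x \ne z$, with $h(z) = 0$. The sketch below is an illustration under assumptions made here: the state encoding and function names are hypothetical, the rule for $a_{n+k}$ is applied for $2 \le k \le n$ (the state $v = a_{n+1}$ has its own special values), time is counted in discrete steps of $P_n$, and plain floating-point Gaussian elimination is used instead of a linear algebra library.

```python
def build_chain(n):
    """Example 5.1 as {state: {state: prob}} with float entries."""
    v, top = ('a', n + 1), ('a', 2 * n + 1)

    def node(kind, k):
        # Endpoint conventions b_0 = c_0 = z and b_{n+1} = c_{n+1} = v.
        return 'z' if k == 0 else v if k == n + 1 else (kind, k)

    P = {'z': {'z': 0.99, ('b', 1): 0.005, ('c', 1): 0.005},
         v: {v: 1 / 2, ('a', n + 2): 1 / 6, ('b', n): 1 / 6, ('c', n): 1 / 6},
         top: {top: 1 / 2, ('a', 2 * n): 1 / 2}}
    for k in range(1, n + 1):
        P[('b', k)] = {('b', k): 1 / 2,
                       node('b', k - 1): 1 / 3, node('b', k + 1): 1 / 6}
        P[('c', k)] = {('c', k): 0.99,
                       node('c', k - 1): 1 / 150, node('c', k + 1): 1 / 300}
    for j in range(n + 2, 2 * n + 1):
        P[('a', j)] = {('a', j): 1 / 2, ('a', j - 1): 1 / 3, ('a', j + 1): 1 / 6}
    return P


def mean_hitting_times(P, target='z'):
    """Solve (I - P restricted to non-target states) h = 1 by Gauss-Jordan elimination."""
    states = [x for x in P if x != target]
    idx = {x: i for i, x in enumerate(states)}
    m = len(states)
    M = [[0.0] * (m + 1) for _ in range(m)]  # augmented matrix
    for x in states:
        i = idx[x]
        M[i][i] += 1.0
        M[i][m] = 1.0
        for y, p in P[x].items():
            if y != target:
                M[i][idx[y]] -= p
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(m):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                for c in range(col, m + 1):
                    M[r][c] -= f * M[col][c]
    return {x: M[idx[x]][m] / M[idx[x]][idx[x]] for x in states}


if __name__ == "__main__":
    n = 20
    h = mean_hitting_times(build_chain(n))
    worst = max(h, key=h.get)
    print(worst, h[worst], h[('a', 2 * n + 1)])
```

For a moderate value such as $n = 20$ the maximizer returned by this sketch lies on the branch $C$, a few steps below $v$, which is in line with the claim $n - j_n = \Theta(\log n)$.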

We now present a more detailed proof for the fact that $x_n = c_{j_n}$ for some $j_n$ such that $n - j_n = \Theta(\log n)$. First write

\[ \mathbb{E}_{c_r}[T_z] = \mathbb{E}_{c_r}[T_z \mid T_z < T_v]\, \frac{2^{n+1} - 2^r}{2^{n+1} - 1} + \big( \mathbb{E}_{c_r}[T_v \mid T_v < T_z] + \mathbb{E}_v[T_z] \big) \frac{2^r - 1}{2^{n+1} - 1}. \tag{5.4} \]


We shall show that there exist absolute constants $K_1, K_2, K_3 > 0$ such that for all $r \in [n]$,

\[ 300 - K_1 2^{-r} \le \mathbb{E}_{c_{r+1}}[T_{c_r}] \le 300, \quad \text{and} \quad 6 - K_1 2^{-r} \le \mathbb{E}_{a_{n+r+1}}[T_{a_{n+r}}],\ \mathbb{E}_{b_{r+1}}[T_{b_r}] \le 6. \tag{5.5} \]

\[ \frac{300n}{2} + \frac{6n}{2} - K_2 \le \mathbb{E}_v[T_z] \le \frac{300(n+1)}{2} + \frac{6(n+1)}{2} = 153(n+1). \tag{5.6} \]

\[ \mathbb{E}_a[T_z] = \mathbb{E}_a[T_v] + \mathbb{E}_v[T_z] \le 159(n+1). \tag{5.7} \]

\[ \mathbb{E}_{c_{n+1-r}}[T_{c_{n-r}} \mid T_z < T_v] = 300 \pm K_3 2^{-r} = \mathbb{E}_{c_r}[T_{c_{r+1}} \mid T_v < T_z]. \tag{5.8} \]

\[ \mathbb{E}_{c_{n+1-r}}[T_z \mid T_z < T_v] = 300(n+1-r) \pm 2K_3 2^{-r}. \tag{5.9} \]

\[ \mathbb{E}_{c_{n+1-r}}[T_v \mid T_v < T_z] = 300r \pm 2K_3 2^{-(n-r)}. \tag{5.10} \]

Combining (5.6)-(5.10) with (5.4) it is easy to verify that indeed $x_n = c_{j_n}$ for some $j_n \in [n]$ such that $n - j_n = \Theta(\log n)$.

We first note that (5.6) and (5.7) follow easily from (5.5). We now prove (5.5).

It is a standard result (e.g. [2] Lemma 1 Chapter 5, or [3] Lemma 5.2) that for a birth and death chain on $\{0, 1, \ldots, 2n+1\}$ with symmetric edge weights $(w_{i,j})_{i,j : |i-j| \le 1}$,

\[ \mathbb{E}_{r+1}[T_r] = \frac{\sum_{i \ge r,\ j \ge r+1} w_{i,j}}{w_{r,r+1}}. \tag{5.11} \]


It follows from (5.11) that the projected chain $(Z_t)$ satisfies $3 - 2^{-(2n-k-1)} \le \mathbb{E}_{k+1}[T_k] \le 3$, for all $0 \le k \le 2n$, which together with (5.1) implies (5.5). As (5.9) and (5.10) follow from (5.8), we conclude the proof by verifying (5.8).

Using Doob's transform (see e.g. [6] Section 17.6) we have that the law of $(X_t)$ up to time $T_v$ (resp. $T_z$) conditioned on $T_v < T_z$ (resp. $T_z < T_v$) is a Markov chain whose transition matrix is given by $P_v(x,y) := P_n(x,y) \frac{\mathrm{H}_y[T_v < T_z]}{\mathrm{H}_x[T_v < T_z]}$ (resp. $P_z(x,y) := P_n(x,y) \frac{\mathrm{H}_y[T_z < T_v]}{\mathrm{H}_x[T_z < T_v]}$).

By (5.1) we get that there exists an absolute constant $K_4 > 0$ such that for all $r \in [n]$, $P_z(c_{n+1-r}, y) = P_n(c_{n+1-r}, y) \pm K_4 2^{-r}$ for all $y$, while $P_v(c_r, c_{r+i}) = P_n(c_r, c_{r-i}) \pm K_4 2^{-r}$ for $i \in \{0, \pm 1\}$ (i.e., up to negligible terms, $P_v$ restricted to $C$ is a nearest neighbor walk with an opposite bias compared to the original chain). This, in conjunction with (5.5), implies (5.8).
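The reversal of the bias under Doob's transform can be seen explicitly on the projected non-lazy walk. The sketch below is illustrative and rests on assumptions made here: the unconditioned walk steps up with probability $1/3$ and down with probability $2/3$, the harmonic function is $h(\ell) = (2^\ell - 1)/(2^{n+1} - 1)$ as in (5.1), and the function name is hypothetical.

```python
from fractions import Fraction as F


def doob_transform_up_down(n):
    """Walk on {1,...,n} conditioned to reach n+1 before 0.

    Returns {l: (up, down)} where up = (1/3) h(l+1)/h(l) and
    down = (2/3) h(l-1)/h(l).
    """
    h = [F(2 ** l - 1, 2 ** (n + 1) - 1) for l in range(n + 2)]
    return {l: (F(1, 3) * h[l + 1] / h[l], F(2, 3) * h[l - 1] / h[l])
            for l in range(1, n + 1)}


if __name__ == "__main__":
    for level, (up, down) in doob_transform_up_down(8).items():
        print(level, float(up), float(down))
```

In this sketch the conditioned walk steps up with probability $2/3 + \frac{1}{3(2^\ell - 1)}$ at level $\ell$, so it has the opposite bias up to an error that decays like $2^{-\ell}$.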